
METHODS OF IDENTIFYING MODULATORS OF BROMODOMAINS 

FIELD OF THE INVENTION 

The present invention provides the three-dimensional structure of a histone 
acetyltransferase bromodomain. The three-dimensional structural information is 
included in the invention. The present invention also identifies, for the first time, that
bromodomains can bind to acetylated binding partners. The interaction between
bromodomains and their binding partners plays a crucial role in various cellular
functions, including the regulation/modulation of DNA transcription. Therefore,
the present invention provides procedures for identifying agents that can modulate the 
interaction of bromodomains and their binding partners by high throughput drug 
screening and/or through the use of rational drug design based on the three- 
dimensional data provided herein. 

BACKGROUND OF THE INVENTION 

In recent years great strides have been made in the elucidation of the steps involved in 
intercellular and intracellular signaling. Indeed, the individual steps of the cascade of 
events involved in a number of cellular signal transduction processes have been 
determined. For example, intercellular signal transduction generally begins with an 
intercellular ligand binding the extracellular portion of a receptor of the plasma 
membrane. The bound receptor then either directly or indirectly initiates the 
activation of one or more cellular factors. An activated cellular factor may act as a
transcription factor by entering the nucleus to interact with its corresponding genomic
response element, or alternatively, it may interact with other cellular factors 
depending on the complexity of the process. In either case, one or more transcription 
factors ultimately bind to one or more specific genomic response elements. This 
binding plays a crucial role in the up and/or down regulation of the transcription of 
the specific genes that are under the control of these genomic response elements. 
However, the process of re-organizing the chromatin of eukaryotic cells, which is a 
prerequisite for the binding of the transcription factor to the genomic response 
elements, has remained a mystery.

Chromatin contains several highly conserved histone proteins including: H3, H4,
H2A, H2B, and H1. These histone proteins package eukaryotic DNA into repeating
nucleosomal units that are folded into higher-order chromatin fibers [Luger and
Richmond, Curr. Opin. Genet. Dev. 8:140-146 (1998)]. A portion of the histone that
comprises roughly a quarter of the protein protrudes from the chromatin surface, and
is thereby sensitive to proteolytic enzymes [van Holde, in Chromatin (Rich, A., ed.,
Springer, New York) pages 111-148 (1988); Hecht et al., Cell 80:583-592 (1995)].
This portion of the histone is known as the "histone tail". Histone tails tend to be free
for protein-protein interaction, and are also the portion of the histone most prone to
post-translational modification. Such post-translational modification includes
acetylation, phosphorylation, methylation, ubiquitination, and ADP-ribosylation [van
Holde, in Chromatin (Rich, A., ed., Springer, New York) pages 111-148 (1988)].

Of all classes of proteins, histones are amongst the most susceptible to post-
translational modification. Perhaps the best studied post-translational modification of
histones is the acetylation of specific lysine residues [Grunstein, M., Nature, 389:349-
352 (1997)]. Indeed, acetylation of histone lysine residues has been suggested to
play a pivotal role in chromatin remodeling and gene activation. Consistently,
distinct classes of enzymes, namely histone acetyltransferases (HATs) and histone
deacetylases (HDACs), acetylate or de-acetylate specific histone lysine residues
[Struhl, Genes Dev. 12:599-606 (1998)].

Nearly all known nuclear HATs contain an approximately 110 amino acid sequence
known as the bromodomain [Jeanmougin et al., Trends in Biochemical Sciences,
22:151-153 (1997)], a protein motif that was initially discovered in the Drosophila
brahma protein. Bromodomains are found in a large number of chromatin-associated
proteins and have now been identified in approximately 40 proteins, often adjacent to
other protein motifs [Jeanmougin et al., Trends in Biochemical Sciences, 22:151-153
(1997); Tamkun et al., Cell, 68:561-572 (1992); Haynes et al., Nucleic Acids Research,
20:2603 (1992)]. Proteins that contain a bromodomain often contain a second
bromodomain. However, despite the wide occurrence of bromodomains and their
likely role in chromatin regulation, their three-dimensional structure and binding
partners heretofore have remained unknown.

Therefore, there is a need to identify a binding partner for a bromodomain. In 
addition, there is a need to identify agonists or antagonists to the bromodomain- 
binding partner complex. Since a preferred method of drug-screening relies on 
structure based drug design, there is also a need to determine the three-dimensional 
structure of a bromodomain. In this case, once the three-dimensional structure of a
bromodomain is determined, potential agonists and/or potential antagonists can be
designed with the aid of computer modeling [Bugg et al., Scientific American,
Dec.:92-98 (1993); West et al., TIPS, 16:67-74 (1995); Dunbrack et al., Folding &
Design, 2:27-42 (1997)]. However, heretofore the three-dimensional structure of the
bromodomain has remained unknown. Therefore, there is a need for obtaining a form 
of the bromodomain that is amenable for NMR analysis and/or X-ray crystallographic 
analysis. Furthermore, there is a need for the determination of the three-dimensional 
structure of the bromodomain. Finally, there is a need for procedures for related 
structural based drug design predicated on such structural data. 

The citation of any reference herein should not be construed as an admission that such 
reference is available as "Prior Art" to the instant application. 

SUMMARY OF THE INVENTION 

The present invention demonstrates, for the first time, that bromodomains bind to acetyl-
lysine residues of proteins. The present invention also provides the three-dimensional
structure of a bromodomain as well as the three-dimensional structure of a 
bromodomain-acetyl-histamine complex. The structural information provided can be 
employed in methods of identifying drugs that can modulate the cellular processes 
that involve bromodomain-acetyl-lysine interactions. These interactions include 
chromatin remodeling, which is a required step in eukaryotic transcription. In a 
particular embodiment, the three-dimensional structural information is used in the 
design of an inhibitor of leukemia. 



The present invention provides an isolated nucleic acid that encodes a peptide 
consisting of about 21 to 40 amino acids that comprises a ZA loop of a bromodomain. 
In a preferred embodiment the peptide comprises about 23 to 34 amino acids. The 
isolated nucleic acid can further comprise a heterologous nucleotide sequence. 

In a preferred embodiment the peptide comprises the amino acid sequence of SEQ ID 
NO:3. In another embodiment the peptide comprises the amino acid sequence of SEQ 
ID NO:43. In particular embodiments the ZA loop is obtained from the bromodomain
having the amino acid sequence of SEQ ID NO:7, or SEQ ID NO:8, or SEQ ID NO:9,
or SEQ ID NO:10, or SEQ ID NO:11, or SEQ ID NO:12, or SEQ ID NO:13, or SEQ
ID NO:14, or SEQ ID NO:15, or SEQ ID NO:16, or SEQ ID NO:17, or SEQ ID
NO:18, or SEQ ID NO:19, or SEQ ID NO:20, or SEQ ID NO:21, or SEQ ID NO:22,
or SEQ ID NO:23, or SEQ ID NO:24, or SEQ ID NO:25, or SEQ ID NO:26, or SEQ
ID NO:27, or SEQ ID NO:28, or SEQ ID NO:29, or SEQ ID NO:30, or SEQ ID
NO:31, or SEQ ID NO:32, or SEQ ID NO:33, or SEQ ID NO:34, or SEQ
ID NO:35, or SEQ ID NO:36, or SEQ ID NO:37, or SEQ ID NO:38, or SEQ ID
NO:39, or SEQ ID NO:40, or SEQ ID NO:41, or SEQ ID NO:42.

The present invention further provides a recombinant DNA molecule that comprises 
an isolated nucleic acid of the present invention, as described above, with or without a 
heterologous nucleotide sequence. Such a recombinant DNA molecule can be 
operatively linked to an expression control sequence and can be part of an expression 
vector. The present invention further provides a cell that comprises such an 
expression vector. The cell can be either a eukaryotic or a prokaryotic cell. The 
present invention further provides a method of expressing the peptides of the present 
invention or fragments thereof in this cell. One such method comprises culturing the 
cell in an appropriate cell culture medium under conditions that provide for 
expression of the peptide by the cell. 

The present invention further provides a peptide consisting of about 21 to 40 amino 
acids that comprises a ZA loop of a bromodomain. In a preferred embodiment the
peptide comprises about 23 to 34 amino acids. The present invention also provides
fusion proteins or peptides comprising these peptides. 

In a preferred embodiment the peptide comprises the amino acid sequence of SEQ ID 
NO:3. In another embodiment the peptide comprises the amino acid sequence of SEQ 
ID NO:43. In particular embodiments the ZA loop is obtained from the bromodomain
having the amino acid sequence of SEQ ID NO:7, or SEQ ID NO:8, or SEQ ID NO:9,
or SEQ ID NO:10, or SEQ ID NO:11, or SEQ ID NO:12, or SEQ ID NO:13, or SEQ
ID NO:14, or SEQ ID NO:15, or SEQ ID NO:16, or SEQ ID NO:17, or SEQ ID
NO:18, or SEQ ID NO:19, or SEQ ID NO:20, or SEQ ID NO:21, or SEQ ID NO:22,
or SEQ ID NO:23, or SEQ ID NO:24, or SEQ ID NO:25, or SEQ ID NO:26, or SEQ
ID NO:27, or SEQ ID NO:28, or SEQ ID NO:29, or SEQ ID NO:30, or SEQ ID
NO:31, or SEQ ID NO:32, or SEQ ID NO:33, or SEQ ID NO:34, or SEQ
ID NO:35, or SEQ ID NO:36, or SEQ ID NO:37, or SEQ ID NO:38, or SEQ ID
NO:39, or SEQ ID NO:40, or SEQ ID NO:41, or SEQ ID NO:42.

The present invention also provides antibodies raised against the peptides/proteins of 
the present invention, or raised against an antigenic fragment of these 
proteins/fragments. In a particular embodiment an antibody is raised against a 
fragment of the ZA loop of a bromodomain. In another embodiment an antibody is 
raised against a fragment of a protein or peptide that comprises an acetyl-lysine, 
wherein the protein or peptide can bind to a bromodomain. Such fragments can be 
conjugated to a carrier protein or be part of a fusion protein. In one embodiment the
antibody is a polyclonal antibody. In another embodiment, the antibody is a
monoclonal antibody. A hybridoma that makes the monoclonal antibody is also part
of the present invention. In a particular embodiment the antibody is a chimeric
antibody. Antibodies that can specifically recognize acetyl-lysine residues involved
in bromodomain binding are also part of the present invention.

In another aspect of the present invention a method is provided for identifying a 
compound that modulates the affinity of a bromodomain for a ligand (and/or protein) 
that comprises an acetylated lysine. One such embodiment comprises contacting the
bromodomain and the ligand in the presence of a compound under conditions in which
the bromodomain and the ligand bind in the absence of the compound. The affinity of
the bromodomain for the ligand is then determined (e.g., measured). A compound is
identified as a compound that modulates the affinity of the bromodomain for the
ligand when there is a change in the affinity of the bromodomain for the ligand in the
presence of the compound. When the affinity of the bromodomain for the ligand
increases in the presence of the compound, the compound is identified as a promoting
agent for the bromodomain-ligand complex. When the affinity of the bromodomain
for the ligand decreases in the presence of the compound, the compound is identified
as an inhibitor of the bromodomain-ligand complex. In a preferred embodiment, the
compound to be tested is pre-selected by performing rational drug design with the set
of atomic coordinates obtained from one or more of Tables 1-6. More preferably the
selecting is performed in conjunction with computer modeling. In a particular
embodiment, the compound is selected by performing rational drug design with a
set of atomic coordinates defining the three-dimensional structure of a
bromodomain consisting of the amino acid sequence of
SEQ ID NO:7 alone or with acetyl-histamine.
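
The decision rule described above can be illustrated with a minimal sketch. It assumes that affinity is reported as a dissociation constant (Kd), so that a lower Kd means a higher affinity. The names `BindingResult` and `classify_compound`, and the 20% tolerance used to ignore small differences, are hypothetical choices made for illustration only and are not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass
class BindingResult:
    """Hypothetical record of one assay: Kd values in micromolar."""
    kd_without_um: float  # Kd of bromodomain/ligand without the compound
    kd_with_um: float     # Kd of bromodomain/ligand with the compound


def classify_compound(result: BindingResult, tolerance: float = 0.2) -> str:
    """Classify a compound by the change in bromodomain-ligand affinity.

    A lower Kd means tighter binding. The tolerance is an assumed
    cut-off below which a change is treated as no change.
    """
    if result.kd_without_um <= 0 or result.kd_with_um <= 0:
        raise ValueError("Kd values must be positive")
    ratio = result.kd_with_um / result.kd_without_um
    if ratio < 1 - tolerance:
        return "promoting agent"   # affinity increased
    if ratio > 1 + tolerance:
        return "inhibitor"         # affinity decreased
    return "no modulation detected"


if __name__ == "__main__":
    # Illustrative numbers only, not measured data.
    print(classify_compound(BindingResult(kd_without_um=100.0, kd_with_um=300.0)))
```

In practice the tolerance would be set from the reproducibility of the particular binding assay used.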

The present invention also provides a method of identifying a compound that 
modulates the stability of a bromodomain-acetyl-lysine binding complex. One such 
embodiment comprises contacting the bromodomain-acetyl-lysine binding complex in 
the presence of the compound under conditions in which the bromodomain-acetyl-
lysine binding complex forms in the absence of the compound. The stability of the
bromodomain-acetyl-lysine binding complex is then determined (e.g., measured). A
compound is identified as a compound that modulates the stability of the
bromodomain-acetyl-lysine binding complex, when there is a change in the stability
of the bromodomain-acetyl-lysine binding complex in the presence of that compound.
When the stability of the bromodomain-acetyl-lysine binding complex increases in the
presence of the compound, the compound is identified as a stabilizing agent. When
the stability of the bromodomain-acetyl-lysine binding complex decreases in the
presence of the compound, the compound is identified as an inhibitor. In a preferred
embodiment, the compound to be tested is pre-selected by performing rational drug
design with the set of atomic coordinates obtained from one or more of Tables 1-6.
More preferably the selecting is performed in conjunction with computer modeling. 
In a particular embodiment, the compound is selected by performing rational drug 
design with the set of atomic coordinates obtained from a set of atomic coordinates 
defining the three-dimensional structure of a bromodomain consisting of the amino 
acid sequence of SEQ ID NO:7 alone or with acetyl-histamine. 

As anyone having skill in the art of drug development would readily understand, the 
potential drugs selected by the above methodologies can be refined by re-testing in 
appropriate drug assays, including those disclosed herein. Chemical analogs of such 
potential drugs can be obtained (either through chemical synthesis or drug libraries) 
and be analogously tested. Therefore, methods comprising successive iterations of the 
steps of the individual drug assays, as exemplified herein, using either repetitive or 
different binding studies, or transcription activation studies or other such studies are 
envisioned in the present invention. In addition, potential drugs may be identified 
first by rapid throughput drug screening, as described below, prior to performing 
computer modeling on a potential drug using the three-dimensional structure of the 
bromodomain. 

The present invention further comprises all of the potential, selected, and putative 
compounds (drugs) identified by the methods of the present invention, as well as the 
final drugs themselves identified with the methods of the present invention. 

The present invention further provides a method for identifying a potential binding 
partner for a protein (e.g., a histone) comprising an acetyl-lysine. One such
embodiment comprises contacting the protein with a polypeptide comprising a
bromodomain. In a preferred embodiment the bromodomain comprises the amino
acid sequence of SEQ ID NO:3. In particular embodiments the bromodomain has the
amino acid sequence of SEQ ID NO:7, or SEQ ID NO:8, or SEQ ID NO:9, or SEQ ID
NO:10, or SEQ ID NO:11, or SEQ ID NO:12, or SEQ ID NO:13, or SEQ ID NO:14,
or SEQ ID NO:15, or SEQ ID NO:16, or SEQ ID NO:17, or SEQ ID NO:18, or SEQ
ID NO:19, or SEQ ID NO:20, or SEQ ID NO:21, or SEQ ID NO:22, or SEQ ID
NO:23, or SEQ ID NO:24, or SEQ ID NO:25, or SEQ ID NO:26, or SEQ ID NO:27,
or SEQ ID NO:28, or SEQ ID NO:29, or SEQ ID NO:30, or SEQ ID
NO:31, or SEQ ID NO:32, or SEQ ID NO:33, or SEQ ID NO:34, or SEQ ID NO:35,
or SEQ ID NO:36, or SEQ ID NO:37, or SEQ ID NO:38, or SEQ ID
NO:39, or SEQ ID NO:40, or SEQ ID NO:41, or SEQ ID NO:42.

The present invention further provides a method for identifying a protein having a 
bromodomain. One such embodiment comprises contacting a cellular extract with a 
peptide comprising an acetyl-lysine. 

The present invention further provides agents that can inhibit the binding of a 
bromodomain with a protein comprising an acetyl-lysine. In one embodiment the 
agent is ISYGR-AcK-KRRQRR (SEQ ID NO:4). In another embodiment the agent is
ARKSTGG-AcK-APRKQL (SEQ ID NO:5). In still another embodiment the agent
is QSTSRHK-AcK-LMFKTE (SEQ ID NO:6). In yet another embodiment the agent
is an analog of acetyl-lysine such as acetyl-histamine. In still another embodiment the
agent is an antibody that recognizes an acetyl-lysine of a protein binding partner of a
bromodomain. In a preferred embodiment the agent is an antibody raised against a
ZA loop of a bromodomain. These agents can be used as pharmaceuticals in
compositions that contain a pharmaceutically acceptable carrier, for example, or in the
various drug assays of the present invention, serving as controls to demonstrate 
specificity. 

Accordingly, it is a principal object of the present invention to provide the three- 
dimensional coordinates of a bromodomain. 

It is a further object of the present invention to provide the three-dimensional 
coordinates of a bromodomain complexed with acetyl-histamine. 

It is a further object of the present invention to provide an assay for identifying 
proteins that contain bromodomains that bind proteins that comprise acetyl-lysine.

It is a further object of the present invention to provide methods of identifying drugs
that can modulate the bromodomain-acetyl-lysine binding complex.

It is a further object of the present invention to provide methods of identifying drugs
that can inhibit the binding of a bromodomain to a protein containing acetyl-lysine.

It is a further object of the present invention to provide methods that incorporate the
use of rational design for identifying such drugs.

It is a further object of the present invention to provide a method of identifying drugs
that can treat leukemia.

It is a further object of the present invention to provide a method of identifying drugs 
that can treat and/or prevent AIDS. 

These and other aspects of the present invention will be better appreciated by 
reference to the following drawings and Detailed Description. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1. Structure-based sequence alignment of a selected number of bromodomains.
The sequences were aligned based on the NMR-derived structure of the P/CAF
bromodomain, and the predicted four α-helices are shown in green boxes.
Bromodomains are grouped on the basis of the sequence and/or functional similarities
as described by Jeanmougin et al. [Trends in Biochemical Sciences, 22:151-153
(1997)]. Residue numbers of the P/CAF bromodomain are indicated above its
sequence. Three absolutely conserved residues, corresponding to Pro751, Pro767, and
Asn803 in the P/CAF bromodomain, are shown in red. Highly conserved residues are
colored in blue. The residues of the P/CAF bromodomain that interact with
acetyl-histamine, as determined by intermolecular NOEs, are indicated by asterisks.
The ZA loop, which is critical for acetyl-lysine binding, for each of the indicated
bromodomains is also identified. The underlined residues were changed individually
by site-directed mutagenesis to Ala. Genbank accession numbers for the proteins are
as indicated in Table 8, in the Example below, along with the SEQ ID NOs. for the
bromodomain sequences.

Figures 2A-2H depict the structure of the P/CAF bromodomain. Figures 2A-2B
show the stereoview of the trace of 30 superimposed NMR-derived structures of
the bromodomain (residues 722-830). The N-terminal four residues (SKEP), which
are structurally disordered, are omitted for clarity. For the final 30 structures, the
root-mean-square deviations (RMSDs) of the backbone and all heavy atoms are 0.63
± 0.11 Å and 1.15 ± 0.12 Å for residues 723-830, respectively. The RMSDs of the
backbone and all heavy atoms for the four α-helices (residues 727-743, 770-776,
785-802, and 807-827) are 0.34 ± 0.04 Å and 0.87 ± 0.06 Å, respectively.
Figures 2C-2D show the stereoview of the bromodomain structures from the bottom of the
protein, which is rotated approximately 90° from the orientation in Figures 2A-2B.
Figure 2E shows the Ribbons [Carson, M., J. Appl. Crystallogr. 24:958-961 (1991)]
depiction of the averaged minimized NMR structure of the P/CAF bromodomain.
The orientation of Figure 2E is as shown in Figures 2A-2B. Figures 2F-2G are
schematic representations of the overall topology of the up-and-down four-helix
bundle folds with the opposite handedness. The left-handed fold is seen in
bromodomain, cytochrome b5, and T4 lysozyme (left, Figure 2F), whereas the
right-handed four-helix bundles are observed in proteins such as hemerythrin and
cytochrome b562 (right, Figure 2G) [Richardson, J., Adv. Protein Chem., 34:167-339
(1989); Presnell and Cohen, Proc. Natl. Acad. Sci. USA 86:6592-6596 (1989)].
Figure 2H is a molecular surface representation of the electrostatic potential (blue =
positive; red = negative) of the bromodomain calculated in GRASP [Nicholls et al.,
Biophys. J. 64:166-170 (1993)]. The hydrophobic and aromatic residues (Tyr809,
Tyr802, Tyr760, Ala757, and Val752) located between the ZA and BC loops are
indicated.
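
As an illustration of how an ensemble RMSD of the kind quoted for Figures 2A-2B might be computed, the following is a minimal sketch. It assumes that the models have already been superimposed and that each model's RMSD is taken against the coordinate-averaged structure; the text does not state the reference structure used, so that choice is an assumption. The function names are hypothetical.

```python
import math
from statistics import mean, stdev


def rmsd(a, b):
    """RMSD between two equally sized lists of (x, y, z) atom positions."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(squared / len(a))


def mean_structure(ensemble):
    """Average each atom's coordinates over all models in the ensemble."""
    n_models = len(ensemble)
    n_atoms = len(ensemble[0])
    return [
        tuple(sum(model[i][k] for model in ensemble) / n_models for k in range(3))
        for i in range(n_atoms)
    ]


def ensemble_rmsd(ensemble):
    """Return (mean, standard deviation) of per-model RMSD to the mean structure.

    Assumes the models are already superimposed; at least two models are needed.
    """
    reference = mean_structure(ensemble)
    values = [rmsd(model, reference) for model in ensemble]
    return mean(values), stdev(values)
```

Restricting the atom lists to backbone atoms, or to the residues of the four α-helices, would give the corresponding subset values.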

Figures 3A-3C show the binding of the P/CAF bromodomain to AcK. Figure 3A
shows the superimposed region of the 2D ¹⁵N-HSQC spectra of the bromodomain
(approximately 0.5 mM) in its free form (red) and complexed to the AcK-containing
H4 peptide (molar ratio 1:6) (black). Figure 3B is the Ribbon and dotted-surface
diagram of the bromodomain depicting the location of the lysine-acetylated H4
peptide binding site. The color coding reflects the chemical shift changes (Δδ) of the
backbone amide ¹H and ¹⁵N resonances upon binding to the AcK peptide as observed
in the ¹⁵N-HSQC spectra. The normalized weighted average of the chemical shift
changes was calculated by $\Delta_{av}/\Delta_{max} = [(\Delta\delta_{HN}^{2} + \Delta\delta_{N}^{2}/25)/2]^{1/2}/\Delta_{max}$,
where $\Delta_{max}$ is the
maximum weighted chemical shift difference observed for Tyr809 (0.16 ppm). The
backbone atoms are color-coded in red, yellow, or green for residues that have
$\Delta_{av}/\Delta_{max}$ >0.6 (Tyr809, Glu808, Asn803, and Ala757), 0.2-0.6 (AlaSlS, Tyr802,
Tyr760, and Val752), or <0.2 (Cys812, Ser807, Cys799, Phe796, and Phe748),
respectively. The non-perturbed residues are shown in blue. Figure 3C shows the
chemical structures of acetyl-lysine, acetyl-histamine, and acetyl-histidine.
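
The chemical shift weighting and color binning described for Figure 3B can be sketched as follows. The sketch assumes the commonly used combined-shift form in which the squared ¹⁵N change is divided by 25 before averaging with the squared ¹H change, and it treats a value of exactly zero as "non-perturbed"; both are assumptions about details the text leaves implicit. Function names are hypothetical, and only the 0.16 ppm maximum is taken from the description.

```python
import math

DELTA_MAX_PPM = 0.16  # maximum weighted shift difference, reported for Tyr809


def weighted_shift(d_h_ppm: float, d_n_ppm: float) -> float:
    """Weighted average of the 1H and 15N amide chemical shift changes (ppm)."""
    return math.sqrt((d_h_ppm ** 2 + d_n_ppm ** 2 / 25.0) / 2.0)


def normalized_shift(d_h_ppm: float, d_n_ppm: float,
                     delta_max_ppm: float = DELTA_MAX_PPM) -> float:
    """Weighted shift change expressed as a fraction of the maximum change."""
    return weighted_shift(d_h_ppm, d_n_ppm) / delta_max_ppm


def color_bin(normalized: float) -> str:
    """Map a normalized shift change onto the color scheme of Figure 3B."""
    if normalized > 0.6:
        return "red"
    if normalized >= 0.2:
        return "yellow"
    if normalized > 0.0:
        return "green"
    return "blue"  # assumed meaning of "non-perturbed"
```

For example, a hypothetical residue with a ¹H change of 0.10 ppm and a ¹⁵N change of 0.50 ppm gives a weighted change of 0.10 ppm, which normalizes to about 0.63 and falls in the red bin.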

Figure 4 depicts the acetyl-lysine binding pocket. This is the Ribbons [Carson, M., J.
Appl. Crystallogr. 24:958-961 (1991)] depiction of a portion of the P/CAF
bromodomain complexed with acetyl-histamine. The ligand is color-coded by
atom type.

DETAILED DESCRIPTION OF THE INVENTION



The present invention identifies a general binding partner (ligand) for the protein
motif known as the bromodomain. Indeed, by combining structural and site-directed
mutagenesis studies the present invention demonstrates that bromodomains can
interact specifically with acetyl-lysine (AcK), making them the first protein modules
known to exhibit such interactions. Like other modular domains, such as Src
homology-2 (SH2) and phosphotyrosine binding (PTB) domains, which specifically
interact with phosphotyrosine-containing proteins, the bromodomain/acetyl-lysine
recognition provides a means to regulate protein-protein interactions via protein lysine
acetylation. The nature of the acetyl-lysine recognition by the bromodomain is
similar to that of histone acetyltransferase interaction with acetyl-CoA. The present
invention therefore couples, for the first time, the functionality of the bromodomain
with the HAT activity of coactivators in the regulation of gene transcription.

The present invention further provides both a nuclear magnetic resonance (NMR)
structure of the bromodomain from the HAT coactivator P/CAF
(p300/CBP-associated factor) as well as the structure for the P/CAF bromodomain in
complex with acetyl-histamine. The structure reveals an unusual left-handed 
up-and-down four-helix bundle. 

The results disclosed herein explain prior deletion experiments which showed that the 
bromodomain is indispensable for the function of GCN5 in yeast. 
Bromodomain-AcK binding also appears to be important for the assembly and activity 
of multiprotein complexes in transcriptional activation. The results reported herein
therefore form the foundation for identifying specific biological ligands and for
defining the molecular mechanisms by which the extensive family of bromodomains
participates in chromatin remodeling and transcriptional activation.

As disclosed herein, the binding partner for the bromodomain is a peptide or protein 
comprising an acetyl-lysine (AcK). Interestingly, whereas a free acetyl-lysine does not 
appear to bind the bromodomain, an analog of the acetyl-lysine, acetyl-histamine, 
does. This is most likely due to the additional charge present in the free amino acid. 
Consistently, free acetyl-histidine also does not bind the bromodomain.

The present invention further provides a key region of the bromodomain for the
interaction with its acetyl-lysine binding partner, the ZA loop. The amino acid
sequence of the ZA loop is defined in Figure 1 for a number of bromodomains and is
depicted in Figure 2A for P/CAF. In a particular embodiment, the ZA loop has
between about 21 and 40 amino acid residues comprising the amino acid sequence of:

F X2-3 P X5-8 J(P/K/H) X Y J(Y/F/H) X5 P D (SEQ ID NO:3)

More preferably the ZA loop has about 23 to 34 amino acid residues and comprises the
amino acid sequence:

X2 F X2-3 P X5-8 J(P/K/H) X Y J(Y/F/H) X5 P J(M/I/V) D (SEQ ID NO:43)

(1) The single letter amino acid code is used in this description, i.e., "F" for
phenylalanine; "P" for proline; "Y" for tyrosine; and "D" for aspartic acid.

(2) "X" indicates any amino acid (an undesignated amino acid); and X, X2,
X2-3, X5, and X5-8 indicate one undesignated amino acid, two consecutive undesignated
amino acids, two or three consecutive undesignated amino acids, five consecutive
undesignated amino acids, and five to eight consecutive undesignated amino acids,
respectively.

(3) "J" indicates that the identity of the amino acid is restricted to a particular
group; again the one letter code is used:

(i) J(P/K/H) is either proline, lysine or histidine.

(ii) J(Y/F/H) is either tyrosine, phenylalanine or histidine.

(iii) J(M/I/V) is either methionine, isoleucine, or valine.
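
A sequence motif of this kind can be checked against candidate sequences with a regular expression. The sketch below encodes one reading of SEQ ID NO:43, in which the motif begins with two undesignated residues and the variable runs are two to three and five to eight residues long; that reading of the printed motif is an assumption. The test sequence joins the P/CAF fragment of SEQ ID NO:44 to four residues ("FPMD") that are appended here purely for illustration and are not taken from the text.

```python
import re

AA = "[ACDEFGHIKLMNPQRSTVWY]"  # any of the 20 standard amino acids ("X")

# Assumed reading of SEQ ID NO:43:
# X2 F X2-3 P X5-8 J(P/K/H) X Y J(Y/F/H) X5 P J(M/I/V) D
ZA_LOOP_PATTERN = re.compile(
    AA + "{2}"       # X2: two undesignated residues
    "F"
    + AA + "{2,3}"   # X2-3
    "P"
    + AA + "{5,8}"   # X5-8
    "[PKH]"          # J(P/K/H)
    + AA +           # X
    "Y"
    "[YFH]"          # J(Y/F/H)
    + AA + "{5}"     # X5
    "P"
    "[MIV]"          # J(M/I/V)
    "D"
)


def contains_za_loop_motif(sequence: str) -> bool:
    """Return True if the sequence contains the assumed ZA loop motif."""
    return ZA_LOOP_PATTERN.search(sequence.upper()) is not None


if __name__ == "__main__":
    # SEQ ID NO:44 fragment plus a hypothetical four-residue continuation.
    candidate = "WPFMEPVKRTEAPGYYEVIR" + "FPMD"
    print(contains_za_loop_motif(candidate))
```

A regular expression only tests the sequence pattern; it says nothing about whether a matching peptide adopts the ZA loop conformation or binds acetyl-lysine.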

Since this region of the bromodomain is important in binding its acetyl-lysine binding
partner, antibodies specifically raised against this region are also included in the
present invention. In a particular embodiment, the antibody is a humanized chimeric
antibody that can be used in therapeutic treatment. Thus monoclonal, chimeric, and
polyclonal antibodies raised against bromodomains, preferably against amino acid
residues in the ZA loop region, are part of the present invention. In a specific
embodiment the antibody is raised against a peptide, fusion peptide or conjugated
peptide consisting of amino acid residues 746 to 765 of SEQ ID NO:2, i.e.,
WPFMEPVKRTEAPGYYEVIR (SEQ ID NO:44). Such antibodies can be used in the
treatment of leukemia, for example. Alternatively, these antibodies can be used in drug
discovery assays.

Thus the present invention provides the first detailed structural information regarding a 
bromodomain and a bromodomain complexed with its acetylated binding partner. The 
present invention therefore provides the three-dimensional structure of the 
bromodomain and a bromodomain acetylated binding partner complex. Since the
interaction of the bromodomain with a histone, for example, can play a significant role
in chromatin remodeling/regulation, the structural information provided herein can be
employed in methods of identifying drugs that can modulate basic cell processes by
modulating transcription. In a particular embodiment, the three-dimensional
structural information is used in the design of a small organic molecule for the
treatment of cancer.

Indeed, the bromodomain and lysine-acetylated protein interaction can now be
implicated to play a causal role in the development of a number of diseases including
cancers such as leukemia. For example, chromatin remodeling plays a central role in
the etiology of viral infection and cancer [Archer and Hodin, Curr. Opin. Genet. Biol.
9:171-174 (1999); Jacobson and Pillus, Curr. Opin. Genet. Biol. 9:175-184 (1999)].
Both altered histone acetylation/deacetylation and aberrant forms of chromatin-
remodeling complexes are associated with human diseases. Furthermore,
chromosomal translocations of various cellular genes with those encoding HATs and
subunits of chromatin remodeling complexes have been implicated in leukemogenesis.
The MOZ (monocytic leukemia zinc finger) and MLL/ALL-1 genes are frequently fused
to the gene encoding the co-activator HAT CBP [Sobulo et al., Proc. Natl. Acad. Sci.
USA 94:8732-8737 (1997)]. The resulting fusion protein MLL-CBP contains the
tandem bromodomain-PHD finger-HAT domain of CBP. It also has been shown that
both the bromodomain and HAT domain of CBP are required for leukemogenesis,
because deletion of either the bromodomain or the HAT domain results in loss of the
MLL-CBP fusion protein's ability to transform cells. These results indicate that the
CBP bromodomain, and more particularly, the ZA loop of the CBP bromodomain, is
an excellent target for developing drugs that interfere with the bromodomain acetyl-
lysine interaction that can be used in the treatment of human acute leukemia. In
addition, an antibody (e.g., a humanized antibody) raised specifically against a peptide
from the ZA loop of the CBP bromodomain could also be effective for treating these
conditions.

Furthermore, the human immunodeficiency virus type 1 (HIV-1) trans-activator
protein, Tat, is absolutely required for productive HIV viral replication [Jeang and
Gatignol, Curr. Top. Microbiol. Immunol. 188:123-144 (1994)]. Recently, it has been
shown that HIV-1 Tat transcriptional activity is tightly regulated by lysine acetylation
[Kiernan et al., EMBO Journal 18:6106-6118 (1999)]. Therefore, the interaction of
the acetyl-lysine of Tat with one or more bromodomain-containing proteins associated
with chromatin remodeling could mediate gene transcription. Thus, the
bromodomain/lysine-acetylated Tat interaction could also serve as a drug target for
blocking HIV replication in cells. Similarly, an antibody raised specifically against a
peptide from the ZA loop of the bromodomain could also be effective for treating these
conditions.

In addition, based on the new structural information disclosed herein, the key amino
acid residues for the binding of a given bromodomain and its binding partner can be
identified and further elucidated using basic mutagenesis and standard isothermal
titration calorimetry, for example. In this case, both the crucial amino acids for the
bromodomain and the binding partner (i.e., apart from the acetyl-lysine) can be readily
determined and are also part of the present invention.

The results obtained from the structural and functional studies disclosed herein provide
the foundation for both high throughput drug screening and structure-based rational
drug design. The agents identified by this procedure will be useful for ameliorating
conditions involving chromatin remodeling/regulation as indicated above.

Structure based rational drug design is the most efficient method of drug development.
However, heretofore, no information has been disclosed regarding the structure of the
bromodomain or more importantly, its interaction with the acetyl-lysine of its binding
partner. Obtaining detailed structural information requires an extensive NMR or X-ray
crystallographic analysis. By determining and then exploiting the detailed structural
information of the bromodomain and of the bromodomain/acetyl-histamine complex
(exemplified by NMR analysis below) the present invention provides novel methods
for developing new drugs through structure based rational drug design.

Thus the present invention provides representative sets of the atomic structure
coordinates of the free form of the P/CAF bromodomain (Table 5) and of the P/CAF
bromodomain-acetyl-histamine complex (Table 6) which were both obtained by NMR
analysis. A ribbon diagram of the three-dimensional structure of the P/CAF
bromodomain is depicted in Figure 2E, whereas the P/CAF bromodomain acetyl-lysine
binding pocket is depicted in Figure 4. The present invention also provides the
NOE-derived distance restraints, and NMR chemical shift assignments of the P/CAF
bromodomain. The NMR chemical shift assignments of the P/CAF bromodomain are
included in the chemical shift table (Table 1) for the ¹H-¹⁵N HSQC spectrum of P/CAF
bromodomain. The unambiguous NOE-derived inter-proton distance restraints
(Table 2), the ambiguous NOE-derived inter-proton distance restraints (Table 3) and
the bonding restraints (Table 4) are also disclosed herein. The sample atomic
coordinate data provided enable the skilled artisan to practice the invention. In
addition, Tables 1-6 are also capable of being placed into a computer readable form
which is also part of the present invention. Furthermore, methods of using these
coordinates and chemical shifts and related information (including in computer
readable forms) either individually or together in drug assays are also provided. More
particularly, such atomic coordinates can be used to identify potential ligands or drugs
which will modulate the binding of a bromodomain with its binding partner.

Therefore, if appearing herein, the following terms shall have the definitions set out 
below. 

As used herein a "bromodomain-acetyl-lysine binding complex" is a binding complex
between a bromodomain or fragment thereof and either a peptide/polypeptide
comprising an acetyl-lysine (or an analog of acetyl-lysine), or a free analog of
acetyl-lysine, such as acetyl-histamine disclosed in the Example below. Preferably, the
peptide comprises at least six amino acids in addition to the acetyl-lysine. The
dissociation constant of a bromodomain-acetyl-lysine binding complex is dependent
on whether the lysine residue or analog thereof is acetylated or not, such that the
affinity of the bromodomain for the peptide comprising the lysine residue (for
example) significantly decreases when that lysine residue is not acetylated.

As used herein a "ZA loop" of a bromodomain is one portion of a bromodomain that is
involved in the binding of the bromodomain to the acetyl-lysine. The structure of the
ZA loop of the bromodomain of P/CAF is depicted in Figure 2A. The ZA loop has
between about 20 and 40 amino acids and comprises the amino acid sequence of SEQ
ID NO:3. More preferably the ZA loop comprises between about 23 and 34 amino acids
and has the amino acid sequence SEQ ID NO:43. The amino acid sequence of the ZA
loop for a representative number of individual bromodomains is shown in Figure 1.

A "polypeptide" or "peptide" comprising a fragment of a bromodomain, such as the
ZA loop, or a peptide or polypeptide comprising an acetyl-lysine, as used herein can be
the "fragment" alone, or a larger chimeric or fusion peptide/protein which contains the
"fragment".

As used herein the terms "fusion protein" and "fusion peptide" are used
interchangeably and encompass "chimeric proteins and/or chimeric peptides" and
fusion "intein proteins/peptides". A fusion protein comprises at least a portion of a
protein or peptide of the present invention, e.g., a bromodomain, joined via a peptide
bond to at least a portion of another protein or peptide including, e.g., a second
bromodomain in a chimeric fusion protein. In a particular embodiment the portion of
the bromodomain is antigenic. Fusion proteins can comprise a marker protein or
peptide, or a protein or peptide that aids in the isolation and/or purification of the
protein, for example.

As used herein, and unless otherwise specified, the terms "agent", "potential drug",
"compound", "test compound" or "potential compound" are used interchangeably, and
refer to chemicals which potentially have a use as an inhibitor or activator/stabilizer of
bromodomain-acetyl-lysine binding. Therefore, such "agents", "potential drugs",
"compounds" and "potential compounds" may be used, as described herein, in drug
assays and drug screens and the like.

As used herein a "small organic molecule" is an organic compound, including a
peptide [or organic compound complexed with an inorganic compound (e.g., metal)]
that has a molecular weight of less than 3 kilodaltons. Such small organic molecules
can be included as agents, etc. as defined above.

As used herein the term "binds to" is meant to include all such specific interactions that
result in two or more molecules showing a preference for one another relative to some
third molecule. This includes processes such as covalent, ionic, hydrophobic and
hydrogen bonding but does not include non-specific associations such as solvent
preferences.

As used herein the term "about" signifies that a value is within twenty percent of the
indicated value, i.e., a peptide containing "about" 20 amino acid residues can contain
between 16 and 24 amino acid residues.
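
The twenty percent tolerance in this definition can be illustrated with a minimal Python sketch. The function name `is_about` and its parameters are hypothetical and chosen only for illustration; integer arithmetic is used so that the boundary values (16 and 24 for "about" 20) are handled exactly.

```python
def is_about(candidate: int, indicated: int, percent: int = 20) -> bool:
    """Return True if `candidate` lies within `percent` percent of `indicated`.

    Hypothetical helper illustrating the definition of "about" given above.
    The comparison is done on integers to avoid floating-point rounding at
    the boundaries of the range.
    """
    return abs(candidate - indicated) * 100 <= percent * abs(indicated)


if __name__ == "__main__":
    # A peptide of "about" 20 residues may contain 16 to 24 residues.
    for n in (15, 16, 20, 24, 25):
        print(n, is_about(n, 20))
```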

General Techniques for Constructing Nucleic Acids That Encode the Bromodomains
and Fragments Thereof (Including ZA Loops), and the Bromodomain Binding
Partners of the Present Invention.

In accordance with the present invention there may be employed conventional
molecular biology, microbiology, and recombinant DNA techniques within the skill of
the art. Such techniques are explained fully in the literature. See, e.g., Sambrook,
Fritsch & Maniatis, Molecular Cloning: A Laboratory Manual, Second Edition (1989)
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (herein
"Sambrook et al., 1989"); DNA Cloning: A Practical Approach, Volumes I and II
(D.N. Glover ed. 1985); Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic Acid
Hybridization [B.D. Hames & S.J. Higgins eds. (1985)]; Transcription And
Translation [B.D. Hames & S.J. Higgins, eds. (1984)]; Animal Cell Culture [R.I.
Freshney, ed. (1986)]; Immobilized Cells And Enzymes [IRL Press, (1986)]; B. Perbal,
A Practical Guide To Molecular Cloning (1984); F.M. Ausubel et al. (eds.), Current
Protocols in Molecular Biology, John Wiley & Sons, Inc. (1994).

Therefore, if appearing herein, the following terms shall have the definitions set out
below.

As used herein, the term "gene" refers to an assembly of nucleotides that encode a
polypeptide, and includes cDNA and genomic DNA nucleic acids.

A "vector" is a replicon, such as a plasmid, phage or cosmid, to which another DNA
segment may be attached so as to bring about the replication of the attached segment.
A "replicon" is any genetic element (e.g., plasmid, chromosome, virus) that functions
as an autonomous unit of DNA replication in vivo, capable of replication under its
own control.

A "cassette" refers to a segment of DNA that can be inserted into a vector at specific
restriction sites. The segment of DNA encodes a polypeptide of interest, and the
cassette and restriction sites are designed to ensure insertion of the cassette in the
proper reading frame for transcription and translation.

A cell has been "transfected" by exogenous or heterologous DNA when such DNA has
been introduced inside the cell.

A "nucleic acid molecule" refers to the phosphate ester polymeric form of
ribonucleosides (adenosine, guanosine, uridine or cytidine; "RNA molecules") or
deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or
deoxycytidine; "DNA molecules"), or any phosphoester analogues thereof, such as
phosphorothioates and thioesters, in either single stranded form, or a double-stranded
helix. Double stranded DNA-DNA, DNA-RNA and RNA-RNA helices are possible.
The term nucleic acid molecule, and in particular DNA or RNA molecule, refers only
to the primary and secondary structure of the molecule, and does not limit it to any
particular tertiary forms. Thus, this term includes double-stranded DNA found, inter
alia, in linear or circular DNA molecules (e.g., restriction fragments), plasmids, and
chromosomes. In discussing the structure of particular double-stranded DNA
molecules, sequences may be described herein according to the normal convention of
giving only the sequence in the 5' to 3' direction along the nontranscribed strand of
DNA (i.e., the strand having a sequence homologous to the mRNA). A "recombinant
DNA molecule" is a DNA molecule that has undergone a molecular biological
manipulation.

A nucleic acid molecule is "hybridizable" to another nucleic acid molecule, such as a
cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid
molecule can anneal to the other nucleic acid molecule under the appropriate
conditions of temperature and solution ionic strength (see Sambrook et al., supra).
The conditions of temperature and ionic strength determine the "stringency" of the
hybridization. For preliminary screening for homologous nucleic acids, low stringency
hybridization conditions, corresponding to a Tm of 55°C, can be used, e.g., 5x SSC,
0.1% SDS, 0.25% milk, and no formamide; or 30% formamide, 5x SSC, 0.5% SDS.
Moderate stringency hybridization conditions correspond to a higher Tm, e.g., 40%
formamide, with 5x or 6x SSC. High stringency hybridization conditions correspond
to the highest Tm, e.g., 50% formamide, 5x or 6x SSC. Hybridization requires that the
two nucleic acids contain complementary sequences, although depending on the
stringency of the hybridization, mismatches between bases are possible. The
appropriate stringency for hybridizing nucleic acids depends on the length of the
nucleic acids and the degree of complementation, variables well known in the art. The
greater the degree of similarity or homology between two nucleotide sequences, the
greater the value of Tm for hybrids of nucleic acids having those sequences. The
relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases
in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater
than 100 nucleotides in length, equations for calculating Tm have been derived (see
Sambrook et al., supra, 9.50-10.51). For hybridization with shorter nucleic acids, i.e.,
oligonucleotides, the position of mismatches becomes more important, and the length
of the oligonucleotide determines its specificity (see Sambrook et al., supra,
11.7-11.8). Preferably a minimum length for a hybridizable nucleic acid is at least about 12
nucleotides; preferably at least about 18 nucleotides; and more preferably the length is
at least about 27 nucleotides; and most preferably 36 nucleotides.
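
As an illustration of how Tm depends on ionic strength, base composition, formamide concentration and hybrid length, the following Python sketch implements a widely used empirical approximation for DNA:DNA hybrids longer than about 100 nucleotides. It is an assumption that this approximation is representative of the equations referred to above; the function name `estimate_tm_celsius` and its parameter names are hypothetical, and the numerical coefficients are those commonly quoted for this approximation rather than values taken from the present disclosure.

```python
import math


def estimate_tm_celsius(na_molar: float, gc_percent: float,
                        formamide_percent: float, length_nt: int) -> float:
    """Approximate Tm (in degrees C) of a long DNA:DNA hybrid.

    Commonly quoted empirical form (an assumption, not taken from the text):
        Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%G+C)
             - 0.63*(%formamide) - 600/length
    """
    if na_molar <= 0 or length_nt <= 0:
        raise ValueError("na_molar and length_nt must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_nt)


if __name__ == "__main__":
    # Raising the formamide concentration lowers the estimated Tm, which is
    # why formamide is varied to adjust hybridization stringency.
    for formamide in (0.0, 30.0, 40.0, 50.0):
        tm = estimate_tm_celsius(1.0, 50.0, formamide, 600)
        print(f"{formamide:4.0f}% formamide -> Tm ~ {tm:.1f} C")
```

The sketch is intended only to show the direction and rough size of each effect; it does not apply to short oligonucleotides, for which mismatch position and length dominate as noted above.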

In a specific embodiment, the term "standard hybridization conditions" refers to a Tm
of 55°C, and utilizes conditions as set forth above. In a preferred embodiment, the
Tm is 60°C; in a more preferred embodiment, the Tm is 65°C.

A DNA "coding sequence" is a double-stranded DNA sequence which is transcribed
and translated into a polypeptide in a cell in vitro or in vivo when placed under the
control of appropriate regulatory sequences. The boundaries of the coding sequence
are determined by a start codon at the 5' (amino) terminus and a translation stop codon
at the 3' (carboxyl) terminus. A coding sequence can include, but is not limited to,
prokaryotic sequences and synthetic DNA sequences. If the coding sequence is
intended for expression in a eukaryotic cell, a polyadenylation signal and transcription
termination sequence will usually be located 3' to the coding sequence.
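
The boundaries of a coding sequence described above (a start codon at the 5' end and an in-frame stop codon at the 3' end) can be sketched in Python as follows. This is a simplified illustration under stated assumptions: it considers only the strand supplied, uses ATG as the sole start codon, and ignores introns; the function name `find_coding_sequence` is hypothetical.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_sequence(dna: str) -> Optional[str]:
    """Return the first ATG-to-stop open reading frame in `dna`, or None.

    The returned string starts with the start codon and ends with the first
    in-frame stop codon, mirroring the 5' and 3' boundaries described above.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon from the start codon, staying in frame.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                return dna[start:pos + 3]
        # No in-frame stop for this ATG; try the next one downstream.
        start = dna.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    print(find_coding_sequence("GGATGAAATTTTAGCC"))
```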

Transcriptional and translational control sequences are DNA regulatory sequences,
such as promoters, enhancers, terminators, and the like, that provide for the expression
of a coding sequence in a host cell. In eukaryotic cells, polyadenylation signals are
control sequences.

A "promoter sequence" is a DNA regulatory region capable of binding RNA
polymerase in a cell and initiating transcription of a downstream (3' direction) coding
sequence. For purposes of defining the present invention, the promoter sequence is
bounded at its 3' terminus by the transcription initiation site and extends upstream (5'
direction) to include the minimum number of bases or elements necessary to initiate
transcription at levels detectable above background. Within the promoter sequence
will be found a transcription initiation site (conveniently defined for example, by
mapping with nuclease S1), as well as protein binding domains (consensus sequences)
responsible for the binding of RNA polymerase.

A coding sequence is "under the control" of transcriptional and translational control
sequences in a cell when RNA polymerase transcribes the coding sequence into
mRNA, which is then trans-RNA spliced and translated into the protein encoded by the
coding sequence.

A DNA sequence is "operatively linked" to an expression control sequence when the
expression control sequence controls and regulates the transcription and translation of
that DNA sequence. The term "operatively linked" includes having an appropriate
start signal (e.g., ATG) in front of the DNA sequence to be expressed and maintaining
the correct reading frame to permit expression of the DNA sequence under the control
of the expression control sequence and production of the desired product encoded by
the DNA sequence. If a gene that one desires to insert into a recombinant DNA
molecule does not contain an appropriate start signal, such a start signal can be inserted
in front of the gene.

As used herein, the term "homologous" in all its grammatical forms refers to the
relationship between proteins that possess a "common evolutionary origin," including
proteins from superfamilies (e.g., the immunoglobulin superfamily) and homologous
proteins from different species (e.g., myosin light chain, etc.) [Reeck et al., Cell,
50:667 (1987)]. Such proteins have sequence homology as reflected by their high
degree of sequence similarity.

Accordingly, the term "sequence similarity" in all its grammatical forms refers to the
degree of identity or correspondence between nucleic acid or amino acid sequences of
proteins that may or may not share a common evolutionary origin (see Reeck et al.,
supra). However, in common usage and in the instant application, the term
"homologous," when modified with an adverb such as "highly," may refer to sequence
similarity and not a common evolutionary origin.

Two DNA sequences are "substantially homologous" when at least about 60%
(preferably at least about 80%, and most preferably at least about 90 or 95%) of the
nucleotides match over the defined length of the DNA sequences. Sequences that are
substantially homologous can be identified by comparing the sequences using standard
software available in sequence data banks, or in a Southern hybridization experiment
under, for example, stringent conditions as defined for that particular system. Defining
appropriate hybridization conditions is within the skill of the art. See, e.g., Maniatis et
al., supra; DNA Cloning, Vols. I & II, supra; Nucleic Acid Hybridization, supra.
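
The percentage of matching nucleotides over a defined length can be sketched in Python as follows. The sketch assumes the two sequences have already been aligned and are of equal length; alignment itself, and the handling of gaps, are outside its scope. The function names `percent_match` and `substantially_homologous` are hypothetical.

```python
def percent_match(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences match."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def substantially_homologous(seq_a: str, seq_b: str,
                             threshold_percent: float = 60.0) -> bool:
    """Apply a match threshold (60% by default, per the definition above)."""
    return percent_match(seq_a, seq_b) >= threshold_percent


if __name__ == "__main__":
    a, b = "ACGTACGTAC", "ACGTACGTTT"
    print(percent_match(a, b))
    print(substantially_homologous(a, b, 60.0))
    print(substantially_homologous(a, b, 90.0))
```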

As used herein an amino acid sequence is 100% "homologous" to a second amino acid
sequence if the two amino acid sequences are identical, and/or differ only by neutral or
conservative substitutions as defined below. Accordingly, an amino acid sequence is
50% "homologous" to a second amino acid sequence if 50% of the two amino acid
sequences are identical, and/or differ only by neutral or conservative substitutions.

As used herein, DNA and protein sequence percent identity can be determined using
MacVector 6.0.1, Oxford Molecular Group PLC (1996) and the Clustal W algorithm
with the alignment default parameters, and default parameters for identity. These
commercially available programs can also be used to determine sequence similarity
using the same or analogous default parameters.

The term "corresponding to" is used herein to refer to similar or homologous sequences,
whether the exact position is identical or different from the molecule to which the
similarity or homology is measured. Thus, the term "corresponding to" refers to the
sequence similarity, and not the numbering of the amino acid residues or nucleotide
bases.

As used herein a "heterologous nucleotide sequence" is a nucleotide sequence that is
added to a nucleotide sequence of the present invention by recombinant methods to
form a nucleic acid which is not naturally formed in nature. Such nucleic acids can
encode fusion proteins or peptides, including chimeric proteins and peptides. Thus the
heterologous nucleotide sequence can encode peptides and/or proteins which contain
regulatory and/or structural properties. In another such embodiment the heterologous
nucleotide can encode a protein or peptide that functions as a means of detecting the
protein or peptide encoded by the nucleotide sequence of the present invention after the
recombinant nucleic acid is expressed. In still another such embodiment the
heterologous nucleotide can function as a means of detecting a nucleotide sequence of
the present invention. A heterologous nucleotide sequence can comprise non-coding
sequences including restriction sites, regulatory sites, promoters and the like.

The present invention also relates to cloning vectors containing nucleic acids encoding
analogs and derivatives of the bromodomains of the present invention and
polypeptides/peptides that can bind a bromodomain when a lysine of the
polypeptide/peptide is acetylated, including modified fragments, that have the same
or homologous functional activity as the individual fragments, and homologs thereof.
The production and use of derivatives and analogs related to the fragments are within
the scope of the present invention.

Due to the degeneracy of nucleotide coding sequences, other DNA sequences which
encode substantially the same amino acid sequence as a nucleic acid encoding a protein
comprising a bromodomain or bromodomain binding partner (i.e., when
post-translationally acetylated) of the present invention, for example, may be used in the
practice of the present invention. These include but are not limited to allelic genes and
homologous genes from other species, which are altered by the substitution of different
codons that encode the same amino acid residue within the sequence, thus producing a
silent change. Likewise, the peptides and polypeptides of the present invention
include, but are not limited to, those containing, as a primary amino acid sequence,
analogous portions of their respective amino acid sequences including altered
sequences in which functionally equivalent amino acid residues are substituted for
residues within the sequence resulting in a conservative amino acid substitution. For
example, one or more amino acid residues within the sequence can be substituted by
another amino acid of a similar polarity, which acts as a functional equivalent,
resulting in a silent alteration. Substitutes for an amino acid within the sequence may
be selected from other members of the class to which the amino acid belongs. For
example, the nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine,
valine, proline, phenylalanine, tryptophan and methionine. Amino acids containing
aromatic ring structures are phenylalanine, tryptophan, and tyrosine. The polar neutral
amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine. The positively charged (basic) amino acids include arginine and lysine.
The negatively charged (acidic) amino acids include aspartic acid and glutamic acid.
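
A silent change of the kind described above, in which different codons encode the same amino acid residue, can be checked with a short Python sketch. The sketch builds the standard genetic code from its conventional compact ordering (bases in the order T, C, A, G); the function names `translate` and `is_silent_change` are hypothetical, and organisms or organelles with non-standard codes are not covered.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, codons ordered TTT, TTC, TTA, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence (length a multiple of 3) to protein."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_change(dna_a: str, dna_b: str) -> bool:
    """True if two different DNA sequences encode the same amino acids."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)


if __name__ == "__main__":
    # CTT -> CTC and AAA -> AAG keep Leu and Lys unchanged (silent change).
    print(translate("ATGCTTAAA"), translate("ATGCTCAAG"))
    print(is_silent_change("ATGCTTAAA", "ATGCTCAAG"))
```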

Particularly preferred conserved amino acid exchanges are:

(a) Lys for Arg or vice versa such that a positive charge may be maintained;

(b) Glu for Asp or vice versa such that a negative charge may be maintained;

(c) Ser for Thr or vice versa such that a free -OH can be maintained;

(d) Gln for Asn or vice versa such that a free NH2 can be maintained;

(e) Ile for Leu or for Val or vice versa as roughly equivalent hydrophobic amino acids;
and

(f) Phe for Tyr or vice versa as roughly equivalent aromatic amino acids.
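
The preferred exchanges (a) through (f), together with the earlier definition of percent "homologous" amino acid sequences, can be sketched in Python as follows. The sketch makes two simplifying assumptions that are not taken from the text: the sequences are pre-aligned and of equal length, and Ile, Leu and Val are treated as a single mutually exchangeable group. The names `is_conservative` and `percent_homologous` are hypothetical.

```python
# One-letter codes for the exchange groups (a) through (f) listed above.
EXCHANGE_GROUPS = [
    {"K", "R"},       # (a) Lys / Arg: positive charge maintained
    {"E", "D"},       # (b) Glu / Asp: negative charge maintained
    {"S", "T"},       # (c) Ser / Thr: free -OH maintained
    {"Q", "N"},       # (d) Gln / Asn: free NH2 maintained
    {"I", "L", "V"},  # (e) Ile / Leu / Val: hydrophobic (grouped by assumption)
    {"F", "Y"},       # (f) Phe / Tyr: aromatic
]


def is_conservative(res_a: str, res_b: str) -> bool:
    """True if two different residues fall in the same exchange group."""
    return res_a != res_b and any(
        res_a in group and res_b in group for group in EXCHANGE_GROUPS)


def percent_homologous(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions that are identical or conservative."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    hits = sum(a == b or is_conservative(a, b)
               for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * hits / len(seq_a)


if __name__ == "__main__":
    print(percent_homologous("KDSQLF", "RETNIY"))  # all conservative exchanges
    print(percent_homologous("KKAA", "KRGW"))      # half identical/conservative
```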


A conservative change generally leads to less change in the structure and function of
the resulting protein. A non-conservative change is more likely to alter the structure,
activity or function of the resulting protein. The present invention should be
considered to include sequences containing conservative changes which do not
significantly alter the activity or binding characteristics of the resulting protein.
Specific amino acid residues for the P/CAF bromodomain have been identified that are
important for binding, indicating a potential lower stringency for the substitution of the
remaining amino acid residues.

All of the peptides/fragments of the present invention can be modified by being placed
in a fusion or chimeric peptide or protein, or labeled, e.g., to have an N-terminal
FLAG-tag or H6 tag. In a particular embodiment the P/CAF bromodomain fragment can be
modified to contain a marker protein such as green fluorescent protein as described in
U.S. Patent No. 5,625,048 filed April 29, 1997 and WO 97/26333, published July 24,
1997, each of which is hereby incorporated by reference herein in its entirety.

The nucleic acids encoding peptides and protein fragments of the present invention and
analogs thereof can be produced by various methods known in the art. The
manipulations which result in their production can occur at the gene or protein level
[Sambrook et al., 1989, supra]. The nucleotide sequence can be cleaved at appropriate
sites with restriction endonuclease(s), followed by further enzymatic modification if
desired, isolated, and ligated in vitro. In addition a nucleic acid sequence can be
mutated in vitro or in vivo, to create and/or destroy translation, initiation, and/or
termination sequences, or to create variations in coding regions and/or form new
restriction endonuclease sites or destroy preexisting ones, to facilitate further in vitro
modification. Any technique for mutagenesis known in the art can be used, including
but not limited to, in vitro site-directed mutagenesis [Hutchinson et al., J. Biol. Chem.,
253:6551 (1978); Zoller and Smith, DNA, 3:479-488 (1984); Oliphant et al., Gene,
44:177 (1986); Hutchinson et al., Proc. Natl. Acad. Sci. U.S.A., 83:710 (1986)], use of
TAB® linkers (Pharmacia), etc. PCR techniques are preferred for site directed
mutagenesis [see Higuchi, 1989, "Using PCR to Engineer DNA", in PCR Technology:
Principles and Applications for DNA Amplification, H. Erlich, ed., Stockton Press,
Chapter 6, pp. 61-70].

The identified and isolated nucleic acids can then be inserted into an appropriate
cloning vector. A large number of vector-host systems known in the art may be used.

Protein expression and purification

A bacterial protein expression system can be used to make various stable isotopically
labeled (¹³C, ¹⁵N, and ²H) protein samples that are useful for a three-dimensional NMR
structural determination of a protein complex. For example a pET14b (Novagen)
bacterial expression vector can be constructed which expresses the recombinant P/CAF
bromodomain as an amino-terminal His-tagged fusion protein.

Protein expression and purification can be conducted using standard procedures for
His-tagged proteins [Zhou et al., J. Biol. Chem. 270:31119-31123 (1995)]. To
optimize the level of protein expression, various bacterial growth and expression
conditions can be screened, which include different E. coli cell lines, and growth and
protein induction temperatures. Generally, it is preferred to obtain the maximum
amount of soluble protein while still inducing protein expression with a relatively low
IPTG concentration, e.g., ~0.2 mM (final concentration) at 16 °C. As exemplified
below, the bromodomain of P/CAF (residues 719-832 of SEQ ID NO:2 which is SEQ
ID NO:7) was subcloned into the pET14b expression vector (Novagen) and expressed
in Escherichia coli BL21(DE3) cells. Uniformly ¹⁵N- and ¹⁵N/¹³C-labeled proteins
were prepared by growing bacteria in a minimal medium containing ¹⁵NH4Cl with or
without ¹³C6-glucose. A uniformly ¹⁵N/¹³C-labeled and fractionally deuterated protein
sample was prepared by growing the cells in 75% ²H2O. The bromodomain was
purified by affinity chromatography on a nickel-IDA column (Invitrogen) followed by
the removal of the poly-His tag by thrombin cleavage. The final purification of the protein
was achieved by size-exclusion chromatography. The acetyl-lysine-containing
peptides were prepared on a MilliGen 9050 peptide synthesizer (Perkin Elmer) using
Fmoc/HBTU chemistry. Acetyl-lysine was incorporated using the reagent
Fmoc-Ac-Lys with HBTU/DIPEA activation. NMR samples contained approximately
1 mM protein in 100 mM phosphate buffer of pH 6.5 and 5 mM perdeuterated DTT and
0.5 mM EDTA in H2O/²H2O (9/1) or ²H2O.

One major advantage of using the heteronuclear multidimensional approach, as
exemplified herein, is that the NMR resonance assignments of a protein are obtained in a
sequence-specific manner which assures accuracy and greatly facilitates data analysis
and structure determination [Clore, G. M. & Gronenborn, A. M. Meth. Enzymol.
239:249-363 (1994)]. In addition, the signal overlapping problems in the protein
spectra are minimized by the use of multidimensional NMR spectra, which separate
the proton signals according to the chemical shifts of their attached hetero-nuclei (such
as ¹⁵N and ¹³C). This NMR approach has been proven very powerful for structural
analysis of large proteins [Clore, G. M. & Gronenborn, A. M. Meth. Enzymol.
239:249-363 (1994)]. To facilitate sequence-specific resonance assignments for the
structural study, a uniformly ¹³C, ¹⁵N-labeled and fractionally (75%) deuterated protein
sample of the bromodomain can be prepared by growing bacterial cells in 75% ²H2O as
exemplified below. Such protein samples can be used for triple-resonance NMR
experiments. A triple-labeled protein sample is useful for high-resolution NMR
structural studies. Because of the favorable ¹H, ¹³C, and ¹⁵N relaxation rates caused by
the partial deuteration of the protein, constant-time triple-resonance NMR spectra can
be acquired with higher digital resolution and sensitivity [Sattler, M. & Fesik, S. W.
Structure 4:1245-1249 (1996)]. In addition, various stable-isotopically labeled (¹⁵N
and ¹³C/¹⁵N) proteins can also be prepared using this procedure.

Synthetic Polypeptides 

The term "polypeptide" is used in its broadest sense to refer to a compound of two or
more subunit amino acids, amino acid analogs, or peptidomimetics. The subunits are
linked by peptide bonds. The terms "polypeptide", "protein", and "peptide" are used
interchangeably herein, though preferably as used herein a "peptide" refers to a
compound of at least two but fewer than fifty subunit amino acids, and a polypeptide or
protein refers to a compound of fifty or more amino acids. The polypeptides of the
present invention may be chemically synthesized or, as detailed above, genetically
engineered or isolated from natural sources.

In addition, potential drugs or agents that may be tested in the drug screening assays of
the present invention may also be chemically synthesized. When the peptide is to be
modified, e.g., acetylated, the modification can be at any time during the peptide
synthesis, including using an acetyl-lysine as a starting material or acetylating a lysine
residue of a peptide after the peptide has been synthesized. In the Example below, the
acetyl-lysine-containing peptides were prepared on a MilliGen 9050 peptide
synthesizer (Perkin Elmer) using Fmoc/HBTU chemistry. Acetyl-lysine was
incorporated using the reagent Fmoc-Ac-Lys with HBTU/DIPEA activation.

Thus, synthetic polypeptides, prepared using the well known techniques of soUd phase, 
liquid phase, or peptide condensation techniques, or any combination thereof, can 
10 include natural and unnatural amino acids. Amino acids used for peptide synthesis 
may be standard Boc (N"-amino protected N"-t-butyloxycarbonyl) amino acid resin 
with the standard deprotecting, neutralization, coupling and wash protocols of the 
original solid phase procedure of Merrifield [J. Am. Chem, Soc, 85:2149-2154 
(1963)], or the base-labile N^'-amino protected 9-fluorenyhnethoxycarbonyl (Fmoc) 
15 amino acids first described by Carpino and Han [J. Org. Chem., 37:3403-3409 (1972)]. 
Both Fmoc and Boc N"-amino protected amino acids can be obtained from Fluka, 
Bachem, Advanced Chemtech, Sigma, Cambridge Research Biochemical, Bachem, or 
Peninsula Labs or other chemical companies familiar to those who practice this art. In 
addition, the method of the invention can be used with other Nα-protecting groups that 
are familiar to those skilled in this art. Solid phase peptide synthesis may be 
accomplished by techniques familiar to those in the art and provided, for example, in 
Stewart and Young [Solid Phase Synthesis, Second Edition, Pierce Chemical Co., 
Rockford, IL (1984)] and Fields and Noble [Int. J. Pept. Protein Res., 35:161-214 
(1990)], or using automated synthesizers, such as sold by ABS. Thus, polypeptides of 
the invention may comprise D-amino acids, a combination of D- and L-amino acids, 
and various "designer" amino acids (e.g., β-methyl amino acids, Cα-methyl amino 
acids, and Nα-methyl amino acids, etc.) to convey special properties. Synthetic amino 
acids include ornithine for lysine, fluorophenylalanine for phenylalanine, and 
norleucine for leucine or isoleucine. Additionally, by assigning specific amino acids at 
specific coupling steps, α-helices, β-turns, β-sheets, γ-turns, and cyclic peptides can be 
generated. 

In a further embodiment, subunits of peptides that confer useful chemical and 
structural properties will be chosen. For example, peptides comprising D-amino acids 
will be resistant to L-amino acid-specific proteases in vivo. In addition, the present 
invention envisions preparing peptides that have more well defined structural 
properties, and the use of peptidomimetics, and peptidomimetic bonds, such as ester 
bonds, to prepare peptides with novel properties. In another embodiment, a peptide 
may be generated that incorporates a reduced peptide bond, i.e., R1-CH2-NH-R2, where 
R1 and R2 are amino acid residues or sequences. A reduced peptide bond may be 
introduced as a dipeptide subunit. Such a molecule would be resistant to peptide bond 
hydrolysis, e.g., protease activity. Such peptides would provide ligands with unique 
function and activity, such as extended half-lives in vivo due to resistance to metabolic 
breakdown, or protease activity. Furthermore, it is well known that in certain systems 
constrained peptides show enhanced functional activity [Hruby, Life Sciences, 31:189- 
199 (1982); Hruby et al., Biochem. J., 268:249-262 (1990)]; the present invention 
provides a method to produce a constrained peptide that incorporates random 
sequences at all other positions. 

Constrained and cyclic peptides. A constrained, cyclic or rigidized peptide may be 
prepared synthetically, provided that in at least two positions in the sequence of the 
peptide an amino acid or amino acid analog is inserted that provides a chemical 
functional group capable of crosslinking to constrain, cyclise or rigidize the peptide 
after treatment to form the crosslink. Cyclization will be favored when a turn-inducing 
amino acid is incorporated. Examples of amino acids capable of crosslinking a peptide 
are cysteine to form disulfides, aspartic acid to form a lactone or a lactam, and a 
chelator such as γ-carboxyl-glutamic acid (Gla) (Bachem) to chelate a transition metal 
and form a cross-link. Protected γ-carboxyl glutamic acid may be prepared by 
modifying the synthesis described by Zee-Cheng and Olson [Biophys. Biochem. Res. 
Commun., 94:1128-1132 (1980)]. A peptide in which the peptide sequence comprises 
at least two amino acids capable of crosslinking may be treated, e.g., by oxidation of 
cysteine residues to form a disulfide or addition of a metal ion to form a chelate, so as 
to crosslink the peptide and form a constrained, cyclic or rigidized peptide. 

The present invention provides strategies to systematically prepare cross-links. For 
example, if four cysteine residues are incorporated in the peptide sequence, different 
protecting groups may be used [Hiskey, in The Peptides: Analysis, Synthesis, Biology, 
Vol. 3, Gross and Meienhofer, eds., Academic Press: New York, pp. 137-167 (1981); 
Ponsanti et al., Tetrahedron, 46:8255-8266 (1990)]. The first pair of cysteines may be 
deprotected and oxidized, then the second set may be deprotected and oxidized. In this 
way a defined set of disulfide cross-links may be formed. Alternatively, a pair of 
cysteines and a pair of chelating amino acid analogs may be incorporated so that the 
cross-links are of a different chemical nature. 

Non-classical amino acids that induce conformational constraints. The following non- 
classical amino acids may be incorporated in the peptide in order to introduce 
particular conformational motifs: 1,2,3,4-tetrahydroisoquinoline-3-carboxylate 
[Kazmierski et al., J. Am. Chem. Soc. 113:2275-2283 (1991)]; (2S,3S)-methyl- 
phenylalanine, (2S,3R)-methyl-phenylalanine, (2R,3S)-methyl-phenylalanine and 
(2R,3R)-methyl-phenylalanine [Kazmierski and Hruby, Tetrahedron Lett. (1991)]; 2- 
aminotetrahydronaphthalene-2-carboxylic acid [Landis, Ph.D. Thesis, University of 
Arizona (1989)]; hydroxy-1,2,3,4-tetrahydroisoquinoline-3-carboxylate [Miyake et al., 
J. Takeda Res. Labs., 43:53-76 (1989)]; β-carboline (D and L) [Kazmierski, Ph.D. 
Thesis, University of Arizona (1988)]; HIC (histidine isoquinoline carboxylic acid) 
[Zechel et al., Int. J. Pep. Protein Res., 43 (1991)]; and HIC (histidine cyclic urea) 
(Dharanipragada). 

The following amino acid analogs and peptidomimetics may be incorporated into a 
peptide to induce or favor specific secondary structures: LL-Acp (LL-3-amino- 
2-propenidone-6-carboxylic acid), a β-turn inducing dipeptide analog [Kemp et al., J. 
Org. Chem., 50:5834-5838 (1985)]; β-sheet inducing analogs [Kemp et al., 
Tetrahedron Lett., 29:5081-5082 (1988)]; β-turn inducing analogs [Kemp et al., 
Tetrahedron Lett., 29:5057-5060 (1988)]; α-helix inducing analogs [Kemp et al., 
Tetrahedron Lett., 29:4935-4938 (1988)]; γ-turn inducing analogs [Kemp et al., J. 
Org. Chem., 54:109-115 (1989)]; and analogs provided by the following references: 
Nagai and Sato, Tetrahedron Lett., 26:647-650 (1985); DiMaio et al., J. Chem. Soc. 
Perkin Trans., p. 1687 (1989); also a Gly-Ala turn analog [Kahn et al., Tetrahedron 

spectra of mutated proteins can be compared to that of the wild-type protein 
bromodomain. 

Chemical-shift perturbations due to ligand binding have proven to be a reliable and 
sensitive probe for the ligand binding site of the protein. This is because the chemical- 
shift changes of the backbone amide groups are likely to reflect any changes in protein 
conformation and/or hydrogen bonding due to the peptide/ligand binding. To examine 
the effects of a mutation on the ligand binding (in this case the ligand is a peptide 
comprising an acetyl-lysine), peptide titration experiments can be conducted by 
following the changes of ¹H/¹⁵N signals of the mutant proteins as a function of the 
peptide concentration. These experiments indicate whether the acetyl-lysine binding 
site remains the same or changes in the mutants relative to the wild type protein. The 
effects of the mutation on the peptide binding affinity can also be examined by NMR 
spectroscopy. If a mutation reduces the binding affinity, a 
change of the exchange phenomenon between the free and the ligand-bound signals 
should be observed in the NMR spectrum. If the reduction in binding affinity causes the 
peptide binding to change from a slow exchange rate to a fast exchange rate, on the 
NMR time scale, then the peptide binding affinity can be determined from the NMR 
titration experiment. From these mutation analyses key amino acid residues that are 
important for binding a peptide comprising the acetyl-lysine can be identified. Such 
analysis has been exemplified below. 
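
By way of illustration only, the determination of a binding affinity from a fast-exchange NMR titration can be sketched as follows. The sketch assumes a simple 1:1 binding model in which the observed chemical-shift change is proportional to the fraction of bound protein. The function names, the search range for the dissociation constant, and the grid-search strategy are assumptions made for this example and are not taken from the Example below.

```python
import math


def bound_fraction(p_total, l_total, kd):
    """Fraction of protein bound for 1:1 binding.

    All three arguments must use the same concentration unit (e.g., molar).
    """
    s = p_total + l_total + kd
    return (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / (2.0 * p_total)


def fit_kd(p_total, ligand_concs, observed_shifts,
           kd_min=1e-6, kd_max=1e-1, steps=2000):
    """Estimate Kd and the saturating shift change from titration data.

    Kd is scanned on a logarithmic grid (hypothetical default range). For
    each trial Kd the best saturating shift is obtained in closed form by
    linear least squares, and the pair with the smallest squared error wins.
    """
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        f = [bound_fraction(p_total, l, kd) for l in ligand_concs]
        denom = sum(x * x for x in f)
        if denom == 0.0:
            continue
        delta_max = sum(x * y for x, y in zip(f, observed_shifts)) / denom
        sse = sum((delta_max * x - y) ** 2 for x, y in zip(f, observed_shifts))
        if best is None or sse < best[0]:
            best = (sse, kd, delta_max)
    return best[1], best[2]
```

A wild-type and a mutant titration can each be passed through `fit_kd`, and the two fitted dissociation constants compared. The approach applies only when the signals are in fast exchange, as noted above.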

Protein Structure Determination by NMR Spectroscopy 

The NMR results from the present invention are summarized by the atomic structure 
coordinates of the free form of the P/CAF bromodomain (Table 5) and of the P/CAF 
bromodomain-acetyl-histamine complex (Table 6). The NMR chemical shift 
assignments of the P/CAF bromodomain are included in the chemical shift table (Table 
1) for the ¹H-¹⁵N HSQC spectrum of P/CAF bromodomain. The unambiguous NOE- 
derived Inter-proton Distance Restraints are in Table 2, the ambiguous NOE-derived 
Inter-proton Distance Restraints are in Table 3, and the H-bonding restraints are 
disclosed in Table 4. 

Backbone and Side-chain Assignments: Sequence-specific backbone assignment can 
be achieved by using a suite of deuterium-decoupled triple-resonance 3D NMR 
experiments which include HNCA, HN(CO)CA, HN(CA)CB, HN(COCA)CB, HNCO, 
and HN(CA)CO experiments [Yamazaki et al., J. Am. Chem. Soc. 116:11655-11666 
(1994)]. The water flip-back scheme is used in these NMR pulse programs to 
minimize amide signal attenuation from water exchange. Sequential side-chain 
assignments are typically accomplished from a series of 3D NMR experiments with 
alternative approaches to confirm the assignments. These experiments include 3D ¹⁵N 
TOCSY-HSQC, HCCH-TOCSY, (H)C(CO)NH-TOCSY, and H(C)(CO)NH-TOCSY 
[see Clore, G. M. & Gronenborn, A. M. Meth. Enzymol. 239:249-363 (1994); Sattler et 
al., Prog. in Nuclear Magnetic Resonance Spec. 4:93-158 (1999)]. 

Stereospecific Methyl Groups: Stereospecific assignments of methyl groups of Valine 
and Leucine residues can be obtained from an analysis of carbon signal multiplet 
splitting using a fractionally ¹³C-labeled protein sample, which can be readily prepared 
using M9 minimal medium containing 10% ¹³C-/90% ¹²C-glucose mixture [see Neri et 
al., Biochemistry 28:7510-7516 (1989)]. 

Dihedral Angle Restraints: Backbone dihedral angle (φ) constraints can be generated 
from the ³J(HN,Hα) coupling constants measured in a HNHA-J experiment [see Vuister, G. 
& Bax, A. J. Am. Chem. Soc. 115:7772-7777 (1993)]. Side-chain dihedral angles (χ1) 
can be obtained from short mixing time ¹⁵N-edited 3D TOCSY-HSQC [see Clore et 
al., J. Biomol. NMR 1:13-22 (1991)] and 3D HNHB experiments [see Matson et al., J. 
Biomol. NMR 3:239-244 (1993)], which can also provide stereospecific assignments of 
β methylene protons. 
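
As a non-limiting illustration, the relationship between the backbone angle φ and the three-bond HN-Hα coupling constant can be sketched with a Karplus-type equation. The coefficients below are the values commonly quoted for the HNHA experiment and are stated here as an assumption. The coupling thresholds and the φ ranges returned by `phi_restraint` are likewise illustrative choices, not values taken from this disclosure.

```python
import math

# Karplus coefficients in Hz (assumed values, commonly quoted for HNHA data).
A, B, C = 6.51, -1.76, 1.60


def j_hn_ha(phi_deg):
    """Predicted 3J(HN,Halpha) coupling in Hz for a backbone phi angle in degrees."""
    theta = math.radians(phi_deg - 60.0)
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C


def phi_restraint(j_hz):
    """Map a measured coupling to a loose phi range in degrees.

    Small couplings suggest helix-like phi, large couplings suggest
    extended (beta-like) phi. Intermediate values are left unrestrained.
    The cut-offs and ranges are hypothetical.
    """
    if j_hz < 6.0:
        return (-90.0, -30.0)
    if j_hz > 8.0:
        return (-160.0, -80.0)
    return None
```

Under these assumed coefficients a helical φ of about -60° predicts a coupling near 4 Hz and an extended φ of about -120° predicts a coupling near 10 Hz, which is why the two ranges can be distinguished.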

Hydrogen Bond Restraints: Amide protons that are involved in hydrogen bonds can 
be identified from an analysis of amide exchange rates measured from a series of 2D 
¹H/¹⁵N HSQC spectra recorded after adding ²H₂O to the protein sample. 

NOE Distance Restraints: Distance restraints are obtained from analysis of ¹⁵N- and 
¹³C-edited 3D NOESY data, which can be collected with different mixing times to 
minimize spin diffusion problems. The nuclear Overhauser effect (NOE)-derived 
restraints are categorized as strong (1.8-3 Å), medium (1.8-4 Å) or weak (1.8-5 Å) 
based on the observed NOE intensities. A recently developed procedure for the 
iterative automated NOE analysis by using ARIA [see Nilges et al., Prog. NMR 
Spectroscopy 32:107-139 (1998)] can be employed which integrates with X-PLOR for 
structural calculations. To ensure the success of ARIA/X-PLOR-assisted NOE analysis 
and structure calculations, the ARIA assigned NOE peaks can be manually confirmed. 
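
For illustration, the categorization of NOE cross-peaks into the strong, medium and weak distance ranges given above can be sketched as follows. The sketch assumes that peak intensity scales with the inverse sixth power of the inter-proton distance and that a reference cross-peak of known distance is available for calibration. The reference distance of 2.5 Å and the function names are hypothetical.

```python
def noe_distance(intensity, ref_intensity, ref_distance=2.5):
    """Estimate an inter-proton distance in angstroms by r^-6 scaling.

    ref_intensity and ref_distance describe a calibration cross-peak of
    known distance (the 2.5 angstrom default is an assumption).
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


def classify_noe(intensity, ref_intensity, ref_distance=2.5):
    """Return (category, lower bound, upper bound) in angstroms.

    The bounds follow the strong/medium/weak ranges stated in the text.
    Peaks that calibrate beyond 5 angstroms return None.
    """
    d = noe_distance(intensity, ref_intensity, ref_distance)
    if d <= 3.0:
        return ("strong", 1.8, 3.0)
    if d <= 4.0:
        return ("medium", 1.8, 4.0)
    if d <= 5.0:
        return ("weak", 1.8, 5.0)
    return None
```

Each returned triple can be written out as a distance restraint for the structure calculation. Automated procedures such as ARIA perform a more elaborate, iterative calibration than this single-reference sketch.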

Intermolecular NOE Distance Restraints: For the structural determination of a 
protein/peptide complex, intermolecular NOE distance restraints can be obtained from 
a ¹³C-edited (F1) and ¹⁵N- and ¹³C-filtered (F3) 3D NOESY data set collected for a 
sample containing isotope-labeled protein and non-labeled peptide. 

Structure Calculations and Refinements: Structures of the protein can be generated 
using a distance geometry/simulated annealing protocol with the X-PLOR program 
[see Nilges et al., FEBS Lett. 229:317-324 (1988); Kuszewski et al., J. Biomol. NMR 
2:33-56 (1992); Brunger, A. T. X-PLOR Version 3.1: A system for X-Ray 
crystallography and NMR (Yale University Press, New Haven, CT, 1993)]. The 
structure calculations can employ inter-proton distance restraints obtained from ¹⁵N- 
and ¹³C-resolved NOESY spectra. The initial low-resolution structures can be used to 
facilitate NOE assignments, and help identify hydrogen bonding partners for slowly 
exchanging amide protons. The experimental restraints of dihedral angles and 
hydrogen bonds can be included in the distance restraints for structure refinements. 

Protein-Structure Based Design of Agonists and Antagonists 
of the Bromodomain-Acetyl-Lysine Binding Complex 

Once the three-dimensional structure of the Bromodomain and the Bromodomain- 
acetyl-lysine binding complex are determined, a potential drug or agent (antagonist or 
agonist) can be examined through the use of computer modeling using a docking 
program such as GRAM, DOCK, or AUTODOCK [Dunbrack et al., 1997, supra]. 
This procedure can include computer fitting of potential agents to the bromodomain, 
for example, to ascertain how well the shape and the chemical structure of the potential 
ligand will complement or interfere with the interaction between the bromodomain and 
the acetyl-lysine [Bugg et al., Scientific American, Dec.:92-98 (1993); West et al., 
TIPS, 16:67-74 (1995)]. Computer programs can also be employed to estimate the 
attraction, repulsion, and steric hindrance of the agent to the dimer-dimer binding site, 
for example. Generally the tighter the fit (e.g., the lower the steric hindrance, and/or 
the greater the attractive force) the more potent the potential drug will be since these 
properties are consistent with a tighter binding constant. Furthermore, the more 
specificity in the design of a potential drug the more likely that the drug will not 
interfere with related proteins. This will minimize potential side-effects due to 
unwanted interactions with other proteins. 
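
The estimation of attraction, repulsion and steric hindrance described above can be illustrated with a deliberately simplified pairwise energy score. This sketch is not the scoring function of GRAM, DOCK or AUTODOCK. It assumes a single Lennard-Jones well depth and contact distance for every atom pair and a distance-dependent dielectric of 4r, all of which are hypothetical simplifications.

```python
import math

COULOMB_K = 332.06  # kcal*angstrom/(mol*e^2), standard electrostatic constant


def pair_energy(r, qi, qj, eps=0.1, r_min=3.8):
    """Energy (kcal/mol) of one atom pair at separation r (angstroms).

    eps and r_min are uniform, assumed Lennard-Jones parameters. The
    r^-12 term models steric repulsion and the r^-6 term models attraction.
    The Coulomb term uses an assumed distance-dependent dielectric of 4r.
    """
    lj = eps * ((r_min / r) ** 12 - 2.0 * (r_min / r) ** 6)
    coulomb = COULOMB_K * qi * qj / (4.0 * r * r)
    return lj + coulomb


def interaction_score(ligand_atoms, site_atoms):
    """Sum pair energies between ligand and binding-site atoms.

    Each atom is a tuple (x, y, z, partial_charge). Lower scores indicate
    a more favorable fit under the assumptions above.
    """
    total = 0.0
    for lx, ly, lz, lq in ligand_atoms:
        for sx, sy, sz, sq in site_atoms:
            r = math.dist((lx, ly, lz), (sx, sy, sz))
            total += pair_energy(r, lq, sq)
    return total
```

Candidate agents posed in the binding site can be ranked by `interaction_score`. A pose with atoms closer than the contact distance is penalized steeply, which reflects the steric hindrance mentioned above.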

Initially a potential drug could be obtained by screening a random peptide library 
produced by recombinant bacteriophage, for example [Scott and Smith, Science, 
249:386-390 (1990); Cwirla et al., Proc. Natl. Acad. Sci., 87:6378-6382 (1990); 
Devlin et al., Science, 249:404-406 (1990)], or a chemical library. An agent selected in 
this manner could then be systematically modified by computer modeling programs 
until one or more promising potential drugs are identified. Such analysis has been 
shown to be effective in the development of HIV protease inhibitors [Lam et al., 
Science 263:380-384 (1994); Wlodawer et al., Ann. Rev. Biochem. 62:543-585 (1993); 
Appelt, Perspectives in Drug Discovery and Design 1:23-48 (1993); Erickson, 
Perspectives in Drug Discovery and Design 1:109-128 (1993)]. 

Such computer modeling allows the selection of a finite number of rational chemical 
modifications, as opposed to the countless number of essentially random chemical 
modifications that could be made, any one of which might lead to a useful drug. Each 
chemical modification requires additional chemical steps, which, while 
reasonable for the synthesis of a finite number of compounds, quickly become 
overwhelming if all possible modifications need to be synthesized. Thus, through 
the use of the three-dimensional structural analysis disclosed herein and computer 
modeling, a large number of these compounds can be rapidly screened on the computer 
monitor screen, and a few likely candidates can be determined without the laborious 
synthesis of untold numbers of compounds. 

Once a potential drug (agonist or antagonist) is identified it can be either selected from 
a library of chemicals as are commercially available from most large chemical 
companies including Merck, Glaxo Wellcome, Bristol-Myers Squibb, Monsanto/Searle, 
Eli Lilly, Novartis and Pharmacia Upjohn, or alternatively the potential drug may be 
synthesized de novo. As mentioned above, the de novo synthesis of one or even a 
relatively small group of specific compounds is reasonable in the art of drug design. 

The potential drug can then be tested in any standard binding assay (including in high 
throughput binding assays) for its ability to bind to the ZA loop of a bromodomain. 
Alternatively the potential drug can be tested for its ability to modulate the binding of a 
bromodomain to acetylated histamine, for example. When a suitable potential drug is 
identified, a second NMR structural analysis can optionally be performed on the 
binding complex formed between the potential drug and either the bromodomain- 
acetyl-lysine binding complex or the bromodomain alone. Computer programs that can 
be used to aid in solving such three-dimensional structures include QUANTA, CHARMM, 
INSIGHT, SYBYL, MACROMODEL, ICM, MOLMOL, RASMOL, and GRASP 
[Kraulis, J. Appl. Crystallogr., 24:946-950 (1991)]. Most if not all of these programs 
and others as well can also be obtained from the World Wide Web through the internet. 

Using the approach described herein and equipped with the structural analysis 
disclosed herein, the three-dimensional structures of other bromodomain-acetyl-lysine 
binding complexes can more readily be obtained and analyzed. Such analysis will, in 
turn, allow corresponding drug screening methodology to be performed using the 
three-dimensional structures of such related complexes. 

For all of the drug screening assays described herein, further refinements to the 
structure of the drug will generally be necessary and can be made by the successive 
iterations of any and/or all of the steps provided by the particular drug screening assay, 
including further structural analysis by NMR, for example. 

Phage Libraries for Drug Screening 

Phage libraries have been constructed which when infected into host E. coli produce 
random peptide sequences of approximately 10 to 15 amino acids [Parmley and Smith, 
Gene 73:305-318 (1988); Scott and Smith, Science 249:386-390 (1990)]. Specifically, 
the phage library can be mixed in low dilutions with permissive E. coli in low melting 
point LB agar which is then poured on top of LB agar plates. After incubating the 
plates at 37° C for a period of time, small clear plaques in a lawn of E. coli will form 
which represent active phage growth and lysis of the E. coli. A representative of these 
phages can be absorbed to nylon filters by placing dry filters onto the agar plates. The 
filters can be marked for orientation, removed, and placed in washing solutions to 
block any remaining absorbent sites. The filters can then be placed in a solution 
containing, for example, a radioactive bromodomain. After a specified incubation 
period, the filters can be thoroughly washed and developed for autoradiography. 
Plaques containing the phage that bind to the radioactive bromodomain can then be 
identified. These phages can be further cloned and then retested for their ability to 
bind to the bromodomain as before. Once the phage has been purified, the binding 
sequence contained within the phage can be determined by standard DNA sequencing 
techniques. Once the DNA sequence is known, synthetic peptides can be generated 
which are encoded by these sequences. These peptides can be tested, for example, for 
their ability to modulate the affinity of the bromodomain for its binding partner (e.g., a 
protein comprising an acetyl-lysine or a fragment of that protein). 

The effective peptide(s) can be synthesized in large quantities for use in in vivo models 
and eventually in humans to treat certain tumors. It should be emphasized that 
synthetic peptide production is relatively non-labor intensive, easily manufactured, 
quality controlled and thus, large quantities of the desired product can be produced 
quite cheaply. Similar combinations of mass produced synthetic peptides have been 
used with great success [Patarroyo, Vaccine, 10:175-178 (1990)]. 

Drug Screening Assays 

The drug screening assays of the present invention may use any of a number of means 
for determining the interaction between an agent or drug and a peptide comprising an 
acetyl-lysine and/or a bromodomain. Thus, standard high throughput drug screening 
procedures can be employed using a library of low molecular weight compounds, for 
example, that can be screened to identify a binding partner for the bromodomain. Any 
such chemical library can be used including those discussed above. 

In a particular assay, a bromodomain is placed on or coated onto a solid support. 
Methods for placing the peptides or proteins on the solid support are well known in the 
art and include such things as linking biotin to the protein and linking avidin to the 
solid support. An agent is allowed to equilibrate with the bromodomain to test for 
binding. Generally, the solid support is washed and agents that are retained are 
selected as potential drugs. Alternatively, a peptide comprising an acetyl-lysine is 
placed on or coated onto a solid support. In a particular embodiment of this type, the 
peptide comprises the amino acid sequence of SEQ ID NO:4. 

The agent may be labeled. For example, in one embodiment radiolabeled agents are 
used to measure the binding of the agent. In another embodiment the agents have 
fluorescent markers. In yet another embodiment, a Biacore chip (Pharmacia) coated 
with the bromodomain is used, for example, and the change in surface conductivity can 
be measured. 

In addition, since a number of proteins have been identified that contain 
bromodomains, and the binding partners of many of these proteins are known, the fact 
that the bromodomain specifically binds to an acetylated lysine as disclosed herein 
allows the identification and preparation of a number of potential modulators of the 
bromodomain-acetyl-lysine binding complex based on the amino acid sequences of the 
binding partners to the proteins. Such potential modulators include: ISYGR-AcK- 
KRRQRR (SEQ ID NO:4), ARKSTGG-AcK-APRKQL (SEQ ID NO:5) and 
QSTSRHK-AcK-LMFKTE (SEQ ID NO:6) which bind to the P/CAF bromodomain as 
shown in the Example, below. Such peptides also can be used, for example, as a 
starting point for the design of an inhibitor of the bromodomain-acetyl-lysine binding 
complex. 
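
For bookkeeping during inhibitor design, the peptide notation used above can be converted into a residue list in which the acetyl-lysine is kept as a single token. The following is a minimal sketch. The function name and the convention of writing acetyl-lysine as the token AcK between hyphens are assumptions for this example.

```python
def parse_peptide(notation):
    """Split a notation such as 'ISYGR-AcK-KRRQRR' into residue tokens.

    Hyphen-separated segments equal to 'AcK' are kept as one residue
    (acetyl-lysine). Every other segment is read as one-letter codes.
    """
    residues = []
    for segment in notation.split("-"):
        if segment == "AcK":
            residues.append("AcK")
        else:
            residues.extend(segment)
    return residues


# Peptides named in the text, keyed by sequence identifier.
PEPTIDES = {
    "SEQ ID NO:4": "ISYGR-AcK-KRRQRR",
    "SEQ ID NO:5": "ARKSTGG-AcK-APRKQL",
    "SEQ ID NO:6": "QSTSRHK-AcK-LMFKTE",
}

# Zero-based index of the acetyl-lysine in each peptide.
ACK_INDEX = {name: parse_peptide(seq).index("AcK") for name, seq in PEPTIDES.items()}
```

The residue list and the acetyl-lysine index make it straightforward to enumerate analogs in which the flanking residues are substituted while the acetyl-lysine is held fixed.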

Alternatively, a drug can be specifically designed to bind to the ZA loop of a 
bromodomain for example, such as the P/CAF bromodomain, and be assayed through 
NMR based methodology [Shuker et al., Science 274:1531-1534 (1996), hereby 
incorporated by reference in its entirety]. In a particular embodiment, analogs of the 
binding partner of the bromodomain can be used in this analysis. One such peptide has 
the amino acid sequence of SEQ ID NO:4. In another embodiment of this type, the 
peptide has the amino acid sequence of SEQ ID NO:5. In another such embodiment of 
this type, the peptide has the amino acid sequence of SEQ ID NO:6. 

The assay begins with contacting a compound with a ¹⁵N-labeled bromodomain. 
Binding of the compound with the ZA loop of the bromodomain can be determined by 
monitoring the ¹⁵N- or ¹H-amide chemical shift changes in two dimensional ¹⁵N- 
heteronuclear single-quantum correlation (¹⁵N-HSQC) spectra upon the addition of the 
compound to the ¹⁵N-labeled bromodomain. Since these spectra can be rapidly 
obtained, it is feasible to screen a large number of compounds [Shuker et al., Science 
274:1531-1534 (1996)]. A compound is identified as a potential ligand if it binds to 
the ZA loop of the bromodomain. In a further embodiment, the potential ligand can 
then be used as a model structure, and analogs to the compound can be obtained (e.g., 
from the vast chemical libraries commercially available, or alternatively through de 
novo synthesis). The analogs are then screened for their ability to bind the ZA loop of 
the bromodomain thus to obtain a ligand. An analog of the potential ligand is chosen 
as a ligand when it binds to the ZA loop of the bromodomain with a higher binding 
affinity than the potential ligand. In a preferred embodiment of this type the analogs 
are screened by monitoring the ¹⁵N- or ¹H-amide chemical shift changes in two 
dimensional ¹⁵N-heteronuclear single-quantum correlation (¹⁵N-HSQC) spectra upon 
the addition of the analog to the ¹⁵N-labeled bromodomain as described above. 
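
The screening step described above can be sketched, by way of example only, as a comparison of amide chemical shifts with and without the compound. The sketch combines the ¹H and ¹⁵N shift changes into one perturbation value per residue. The ¹⁵N scaling factor of 0.2, the 0.05 ppm threshold, the requirement of two perturbed residues, and all function names are assumptions. The residue numbers that make up the site are supplied by the user.

```python
import math

N_WEIGHT = 0.2  # assumed scaling of 15N shift changes relative to 1H


def combined_shift(d_h, d_n, n_weight=N_WEIGHT):
    """Combine 1H and 15N shift changes (ppm) into a single perturbation value."""
    return math.sqrt(d_h ** 2 + (n_weight * d_n) ** 2)


def perturbed_residues(free, bound, threshold=0.05):
    """Return {residue: perturbation} for residues at or above the threshold.

    free and bound map residue numbers to (1H ppm, 15N ppm) peak positions
    from HSQC spectra recorded without and with the compound.
    """
    hits = {}
    for res, (h0, n0) in free.items():
        if res not in bound:
            continue  # peak missing or unassigned in the second spectrum
        h1, n1 = bound[res]
        csp = combined_shift(h1 - h0, n1 - n0)
        if csp >= threshold:
            hits[res] = csp
    return hits


def binds_site(hits, site_residues, min_hits=2):
    """True when at least min_hits perturbed residues fall within the site."""
    return len(set(hits) & set(site_residues)) >= min_hits
```

A compound whose perturbed residues cluster in the ZA loop would be carried forward as a potential ligand. The same functions can be reused for analogs, with the size of the perturbation at a fixed concentration serving as a rough guide to relative affinity.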

In another further embodiment, compounds are screened for binding to two nearby 
sites on the bromodomain. In this case, a compound that binds a first site of the 
bromodomain does not bind a second nearby site. Binding to the second site can be 
determined by monitoring changes in a different set of amide chemical shifts in either 
the original screen or a second screen conducted in the presence of a ligand (or 
potential ligand) for the first site. From an analysis of the chemical shift changes the 
approximate location of a potential ligand for the second site is identified. 
Optimization of the second ligand for binding to the site is then carried out by 
screening structurally related compounds (e.g., analogs as described above). When 
ligands for the first site and the second site are identified, their location and orientation 
in the ternary complex can be determined experimentally either by NMR spectroscopy 
or X-ray crystallography. On the basis of this structural information, a linked 
compound is synthesized in which the ligand for the first site and the ligand for the 
second site are linked. In a preferred embodiment of this type the two ligands are 
covalently linked. This linked compound is tested to determine if it has a higher 
binding affinity for the bromodomain than either of the two individual ligands. A 
linked compound is selected as a ligand when it has a higher binding affinity for the 
bromodomain than either of the two ligands. In a preferred embodiment the affinity of 
the linked compound with the bromodomain is determined by monitoring the ¹⁵N- or ¹H- 
amide chemical shift changes in two dimensional ¹⁵N-heteronuclear single-quantum 
correlation (¹⁵N-HSQC) spectra upon the addition of the linked compound to the ¹⁵N- 
labeled bromodomain as described above. 

A larger linked compound can be constructed in an analogous manner, e.g., linking 
three ligands which bind to three nearby sites on the bromodomain to form a 
multilinked compound that has an even higher affinity for the bromodomain than the 
linked compound. 
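
The rationale for linking ligands can be illustrated with an idealized free-energy estimate. If the binding free energies of two ligands were strictly additive, the dissociation constant of the linked compound would approach the product of the individual dissociation constants expressed in molar units. The sketch below states that idealization, which ignores linker strain and entropy and is therefore an assumption, together with the selection rule given in the text. The function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temp_k=298.15):
    """Binding free energy in kcal/mol for a dissociation constant in molar."""
    return R_KCAL * temp_k * math.log(kd_molar)


def ideal_linked_kd(kd_a, kd_b):
    """Idealized Kd (molar) of a linked compound if free energies simply add.

    Real linked compounds usually bind more weakly than this estimate.
    """
    return kd_a * kd_b


def select_linked(kd_linked, kd_a, kd_b):
    """Selection rule from the text: keep the linked compound only when it
    binds more tightly (lower Kd) than either individual ligand."""
    return kd_linked < min(kd_a, kd_b)
```

For example, two ligands with dissociation constants of 2 mM and 0.1 mM would give an idealized linked value of 0.2 µM. A measured value anywhere below 0.1 mM would still satisfy the selection rule.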

Identification of New Bromodomains 

By disclosing that protein bound acetyl-lysine is a binding partner for bromodomains, 
the present invention provides a method of identifying novel proteins that contain 
bromodomains. In short, a protein fragment or analog thereof comprising an acetyl- 
lysine can be used as bait to identify a binding partner that comprises a bromodomain. 
Any one of a number of procedures can be carried out to identify such a binding 
partner. One such assay comprises passing a cell extract over the bait peptide which is 
attached to a solid support. After washing the solid support to remove any non- 
specific binders, the bromodomain containing protein can be eluted from the solid 
support with an appropriate eluant. In a particular embodiment, the free bait peptide 
can be used in the elution. Other methodology includes the use of a yeast two-hybrid 
system, a GST pull down assay, ELISA, immunometric assays, and a modification of 
the CORT procedure of Schlessinger et al. (US Patent No. 5,858,686, issued on 
January 12, 1999, which is hereby incorporated by reference in its entirety) for use with 
the bromodomain-acetyl-lysine binding complex. 

Labels 

Suitable labels include enzymes, fluorophores (e.g., fluorescein isothiocyanate (FITC), 
phycoerythrin (PE), Texas red (TR), rhodamine, free or chelated lanthanide series salts, 
especially Eu³⁺, to name a few fluorophores), chromophores, radioisotopes, chelating 
agents, dyes, colloidal gold, latex particles, ligands (e.g., biotin), and 
chemiluminescent agents. When a control marker is employed, the same or different 
labels may be used for the test and control marker gene. 

In the instance where a radioactive label, such as the isotopes ³H, ¹⁴C, ³²P, ³⁵S, ³⁶Cl, 
⁵¹Cr, ⁵⁷Co, ⁵⁸Co, ⁵⁹Fe, ⁹⁰Y, and ¹⁸⁶Re, is used, known currently available 
counting procedures may be utilized. In the instance where the label is an enzyme, 
detection may be accomplished by any of the presently utilized colorimetric, 
spectrophotometric, fluorospectrophotometric, amperometric or gasometric techniques 
known in the art. 

Direct labels are one example of labels which can be used according to the present 
invention. A direct label has been defined as an entity, which in its natural state, is 
readily visible, either to the naked eye, or with the aid of an optical filter and/or applied 
stimulation, e.g. U.V. light to promote fluorescence. Examples of colored 
labels which can be used according to the present invention include metallic sol 
particles, for example, gold sol particles such as those described by Leuvering (U.S. 
Patent 4,313,734); dye sol particles such as described by Gribnau et al. (U.S. Patent 
4,373,932) and May et al. (WO 88/08534); dyed latex such as described by May, supra, 
Snyder (EP-A 0 280 559 and 0 281 327); or dyes encapsulated in liposomes as 
described by Campbell et al. (U.S. Patent 4,703,017). Other direct labels include a 
radionucleotide, a fluorescent moiety or a luminescent moiety. In addition to these 
direct labeling devices, indirect labels comprising enzymes can also be used according 
to the present invention. Various types of enzyme linked immunoassays are well 
known in the art, for example, those using alkaline phosphatase, horseradish peroxidase, 
lysozyme, glucose-6-phosphate dehydrogenase, lactate dehydrogenase, or urease; these 
and others have been discussed in detail by Eva Engvall in Enzyme Immunoassay 
ELISA and EMIT in Methods in Enzymology, 70:419-439 (1980) and in U.S. Patent 
4,857,453. 

Suitable enzymes include, but are not limited to, alkaline phosphatase, β-galactosidase, 
green fluorescent protein and its derivatives, luciferase, and horseradish peroxidase. 

Other labels for use in the invention include magnetic beads or magnetic resonance 
imaging labels. 

Antibodies to Portions of the Bromodomain that Interact with Acetyl-Lysine 

According to the present invention, the bromodomains, and more particularly the ZA 
loops of the bromodomains and fragments thereof, can be produced by a recombinant 
source, through chemical synthesis, or through the modification of these peptides 
and fragments. These peptides and fragments, and derivatives or analogs thereof, 
including fusion proteins, may be used as an immunogen to generate antibodies that 
specifically interfere with the formation of the bromodomain-acetyl-lysine binding 
complex. Similarly, antibodies can be raised against peptides that comprise one or more 
acetyl-lysine residues which also interfere with the formation of the 
bromodomain-acetyl-lysine binding complex. Such antibodies include but are not 
limited to polyclonal, monoclonal, chimeric, single chain, Fab fragments, and a Fab 
expression library. 

Various procedures known in the art may be used for the production of the polyclonal 
antibodies. For the production of antibody, various host animals, including but not 
limited to rabbits, mice, rats, sheep, goats, etc., can be immunized by 
injection with the peptide having the amino acid sequence of SEQ ID NO:3, for 
example, or a derivative (e.g., a fusion protein) thereof. In one embodiment, the 
peptide can be conjugated to an immunogenic carrier, e.g., bovine serum albumin (BSA) 
or keyhole limpet hemocyanin (KLH). Various adjuvants may be used to increase the 
immunological response, depending on the host species, including but not limited to 
Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface 
active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil 
emulsions, keyhole limpet hemocyanins, dinitrophenol, and potentially useful human 
adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum. 

For preparation of monoclonal antibodies directed toward the peptides or protein 
fragments of the present invention, or analog, or derivative thereof, any technique that 
provides for the production of antibody molecules by continuous cell lines in culture 
may be used. These include but are not limited to the hybridoma technique originally 
developed by Kohler and Milstein [Nature, 256:495-497 (1975)], as well as the trioma 
technique, the human B-cell hybridoma technique [Kozbor et al., Immunology Today,
4:72 (1983); Cote et al., Proc. Natl. Acad. Sci. U.S.A., 80:2026-2030 (1983)], and the
EBV-hybridoma technique to produce human monoclonal antibodies [Cole et al., in
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96 (1985)]. In
an additional embodiment of the invention, monoclonal antibodies can be produced in
germ-free animals utilizing technology described in PCT/US90/02545. In fact,
according to the invention, techniques developed for the production of "chimeric
antibodies" [Morrison et al., J. Bacteriol., 159:870 (1984); Neuberger et al., Nature,
312:604-608 (1984); Takeda et al., Nature, 314:452-454 (1985)] by splicing the genes
from a mouse antibody molecule specific for the peptide having the amino acid 
sequence of SEQ ID NO:3, for example, together with genes from a human antibody 
molecule of appropriate biological activity can be used; such antibodies are within the 
scope of this invention. Such human or humanized chimeric antibodies are preferred 
for use in therapy of human diseases or disorders (described infra), since the human or 
humanized antibodies are much less likely than xenogenic antibodies to induce an 
immune response, in particular an allergic response, themselves. 

According to the invention, techniques described for the production of single chain 
antibodies [U.S. Patent Nos. 5,476,786 and 5,132,405 to Huston; U.S. Patent 
4,946,778] can be adapted to produce specific single chain antibodies. An additional 
embodiment of the invention utilizes the techniques described for the construction of 
Fab expression libraries [Huse et al., Science, 246:1275-1281 (1989)] to allow rapid
and easy identification of monoclonal Fab fragments with the desired specificity.
Antibody fragments which contain the idiotype of the antibody molecule can be
generated by known techniques. For example, such fragments include but are not 
limited to: the F(ab')2 fragment which can be produced by pepsin digestion of the 
antibody molecule; the Fab' fragments which can be generated by reducing the 
disulfide bridges of the F(ab')2 fragment, and the Fab fragments which can be 
generated by treating the antibody molecule with papain and a reducing agent. 

In the production of antibodies, screening for the desired antibody can be accomplished 
by techniques known in the art, e.g., radioimmunoassay, ELISA (enzyme-linked
immunosorbent assay), "sandwich" immunoassays, immunoradiometric assays, gel
diffusion precipitin reactions, immunodiffusion assays, in situ immunoassays (using 
colloidal gold, enzyme or radioisotope labels, for example), western blots, precipitation 
reactions, agglutination assays (e.g., gel agglutination assays, hemagglutination 
assays), complement fixation assays, immunofluorescence assays, protein A assays, 
and immunoelectrophoresis assays, etc. In one embodiment, antibody binding is 
detected by detecting a label on the primary antibody. In another embodiment, the 
primary antibody is detected by detecting binding of a secondary antibody or reagent to 
the primary antibody. In a further embodiment, the secondary antibody is labeled.
Many means are known in the art for detecting binding in an immunoassay and are
within the scope of the present invention. For example, to select antibodies which
recognize a specific epitope of a ZA loop of a bromodomain, one may
assay generated hybridomas for a product which binds to a bromodomain fragment 
containing such an epitope and choose those which do not cross-react with 
bromodomain fragments that do not include that epitope. 

In a specific embodiment, antibodies that interfere with the formation of the 
bromodomain-acetyl-lysine complex can be generated. Such antibodies can be tested 
using the assays described and could potentially be used in anti-cancer therapies. 

Administration 



According to the invention, the component or components of a therapeutic 
composition, e.g., an agent of the invention that interferes with the bromodomain-
acetyl-lysine binding complex such as the peptide having the amino acid sequence of
SEQ ID NOs:4, 5, or 6 and a pharmaceutically acceptable carrier, may be introduced
parenterally, transmucosally, e.g., orally, nasally, or rectally, or transdermally.
Preferably, administration is parenteral, e.g., via intravenous injection, and also
including, but not limited to, intra-arteriole, intramuscular, intradermal,
subcutaneous, intraperitoneal, intraventricular, and intracranial administration. 

In a preferred aspect, the agent of the present invention can cross cellular and nuclear 
membranes, which would allow for intravenous or oral administration. Strategies are 
available for such crossing, including but not limited to, increasing the hydrophobic
nature of a molecule; introducing the molecule as a conjugate to a carrier, such as a 
ligand to a specific receptor, targeted to a receptor; and the like. 

The present invention also provides for conjugating targeting molecules to such an 
agent. "Targeting molecule" as used herein shall mean a molecule which, when
administered in vivo, localizes to desired location(s). In various embodiments, the 
targeting molecule can be a peptide or protein, antibody, lectin, carbohydrate, or 
steroid. In one embodiment, the targeting molecule is a peptide ligand of a receptor on 
the target cell. In a specific embodiment, the targeting molecule is an antibody. 
Preferably, the targeting molecule is a monoclonal antibody. In one embodiment, to
facilitate crosslinking, the antibody can be reduced to two heavy and light chain
heterodimers, or the F(ab')2 fragment can be reduced, and crosslinked to the agent via 
the reduced sulfhydryl. Antibodies for use as targeting molecule are specific for a cell 
surface antigen. 

In another embodiment, the therapeutic compound can be delivered in a vesicle, in 
particular a liposome [see Langer, Science, 249:1527-1533 (1990); Treat et al., in
Liposomes in the Therapy of Infectious Disease and Cancer, Lopez-Berestein and
Fidler (eds.), Liss: New York, pp. 353-365 (1989); Lopez-Berestein, ibid., pp. 317-327;
see generally ibid.].

In yet another embodiment, the therapeutic compound can be delivered in a controlled 
release system. For example, the agent may be administered using intravenous
infusion, an implantable osmotic pump, a transdermal patch, liposomes, or other
modes of administration. In one embodiment, a pump may be used [see Langer, supra;
Sefton, CRC Crit. Ref. Biomed. Eng., 14:201 (1987); Buchwald et al., Surgery, 88:507
(1980); Saudek et al., N. Engl. J. Med., 321:574 (1989)]. In another embodiment,
polymeric materials can be used [see Medical Applications of Controlled Release, 
Langer and Wise (eds.), CRC Press: Boca Raton, Florida (1974); Controlled Drug 
Bioavailability, Drug Product Design and Performance, Smolen and Ball (eds.), 
Wiley: New York (1984); Ranger and Peppas, J. Macromol. Sci. Rev. Macromol.
Chem., 23:61 (1983); see also Levy et al., Science, 228:190 (1985); During et al., Ann.
Neurol., 25:351 (1989); Howard et al., J. Neurosurg., 71:105 (1989)]. In yet another
embodiment, a controlled release system can be placed in proximity of the therapeutic 
target, i.e., the bone marrow, thus requiring only a fraction of the systemic dose [see, 
e.g., Goodson, in Medical Applications of Controlled Release, supra, vol. 2, pp. 115-138
(1984)]. Other controlled release systems are discussed in the review by Langer
[Science, 249:1527-1533 (1990)]. 

Pharmaceutical Compositions. In yet another aspect of the present invention, provided 
are pharmaceutical compositions of the above. Such pharmaceutical compositions may 
be for administration for injection, or for oral, pulmonary, nasal or other forms of 
administration. In general, comprehended by the invention are pharmaceutical 
compositions comprising effective amounts of a low molecular weight component or 
components, or derivative products, of the invention together with pharmaceutically 
acceptable diluents, preservatives, solubilizers, emulsifiers, adjuvants and/or carriers. 
Such compositions include diluents of various buffer content (e.g., Tris-HCl, acetate,
phosphate), pH and ionic strength; additives such as detergents and solubilizing agents
(e.g., Tween 80, Polysorbate 80), anti-oxidants (e.g., ascorbic acid, sodium
metabisulfite), preservatives (e.g., Thimerosal, benzyl alcohol) and bulking substances
(e.g., lactose, mannitol); incorporation of the material into particulate preparations of
polymeric compounds such as polylactic acid, polyglycolic acid, etc. or into liposomes.
Hyaluronic acid may also be used. Such compositions may influence the physical
state, stability, rate of in vivo release, and rate of in vivo clearance of the present 
proteins and derivatives. See, e.g., Remington's Pharmaceutical Sciences, 18th Ed.
[1990, Mack Publishing Co., Easton, PA 18042] pages 1435-1712 which are herein
incorporated by reference. The compositions may be prepared in liquid form, or may
be in dried powder, such as lyophilized form. 

Oral Delivery. Contemplated for use herein are oral solid dosage forms, which are 
described generally in Remington's Pharmaceutical Sciences, 18th Ed. 1990 (Mack
Publishing Co. Easton PA 18042) at Chapter 89, which is herein incorporated by 
reference. Solid dosage forms include tablets, capsules, pills, troches or lozenges, 
cachets or pellets. Also, liposomal or proteinoid encapsulation may be used to 
formulate the present compositions (as, for example, proteinoid microspheres reported
in U.S. Patent No. 4,925,673). Liposomal encapsulation may be used and the
liposomes may be derivatized with various polymers (e.g., U.S. Patent No. 5,013,556).
A description of possible solid dosage forms for the therapeutic is given by Marshall,
K. In: Modern Pharmaceutics Edited by G.S. Banker and C.T. Rhodes Chapter 10,
1979, herein incorporated by reference. In general, the formulation will include an
agent of the present invention (or chemically modified forms thereof) and inert
ingredients which allow for protection against the stomach environment, and release of
the biologically active material in the intestine.

Also specifically contemplated are oral dosage forms of the above derivatized 
component or components. The component or components may be chemically
modified so that oral delivery of the derivative is efficacious. Generally, the chemical
modification contemplated is the attachment of at least one moiety to the component
molecule itself, where said moiety permits (a) inhibition of proteolysis; and (b) uptake
into the blood stream from the stomach or intestine. Also desired is the increase in
overall stability of the component or components and increase in circulation time in the
body. An example of such a moiety is polyethylene glycol. 

For the component (or derivative) the location of release may be the stomach, the small 
intestine (the duodenum, the jejunum, or the ileum), or the large intestine. One skilled 
in the art has available formulations which will not dissolve in the stomach, yet will
release the material in the duodenum or elsewhere in the intestine. Preferably, the
release will avoid the deleterious effects of the stomach environment, either by
protection of the protein (or derivative) or by release of the biologically active material
beyond the stomach environment, such as in the intestine. 

The therapeutic can be included in the formulation as fine multi-particulates in the 
form of granules or pellets of particle size about 1 mm. The formulation of the
material for capsule administration could also be as a powder, lightly compressed
plugs or even as tablets. The therapeutic could be prepared by compression. 

One may dilute or increase the volume of the therapeutic with an inert material. These 
diluents could include carbohydrates, especially mannitol, α-lactose, anhydrous lactose,
cellulose, sucrose, modified dextrans and starch. Certain inorganic salts may also
be used as fillers including calcium triphosphate, magnesium carbonate and sodium
chloride. Some commercially available diluents are Fast-Flo, Emdex, STA-Rx 1500,
Emcompress and Avicel.

Disintegrants may be included in the formulation of the therapeutic into a solid dosage 
form. Materials used as disintegrants include but are not limited to starch, including
the commercial disintegrant based on starch, Explotab. Binders also may be used to
hold the therapeutic agent together to form a hard tablet and include materials from
natural products such as acacia, tragacanth, starch and gelatin.

An anti-frictional agent may be included in the formulation of the therapeutic to 
prevent sticking during the formulation process. Lubricants may be used as a layer 
between the therapeutic and the die wall. Glidants that might improve the flow 
properties of the drug during formulation and to aid rearrangement during compression
also might be added. The glidants may include starch, talc, pyrogenic silica and 
hydrated silicoaluminate. 

In addition, to aid dissolution of the therapeutic into the aqueous environment a 
surfactant might be added as a wetting agent. Additives which potentially enhance
uptake of the protein (or derivative) are for instance the fatty acids oleic acid, linoleic
acid and linolenic acid.

Nasal Delivery. Nasal delivery of an agent of the present invention (or derivative) is
also contemplated. Nasal delivery allows the passage of a peptide, for example, to the
blood stream directly after administering the therapeutic product to the nose, without
the necessity for deposition of the product in the lung. Formulations for nasal delivery
include those with dextran or cyclodextran.

Transdermal administration. Various and numerous methods are known in the art for 
transdermal administration of a drug, e.g., via a transdermal patch. Transdermal
patches are described in, for example, U.S. Patent No. 5,407,713, issued April 18, 1995
to Rolando et al.; U.S. Patent No. 5,352,456, issued October 4, 1994 to Fallon et al.;
U.S. Patent No. 5,332,213 issued August 9, 1994 to D'Angelo et al.; U.S. Patent No.
5,336,168, issued August 9, 1994 to Sibalis; U.S. Patent No. 5,290,561, issued March
1, 1994 to Farhadieh et al.; U.S. Patent No. 5,254,346, issued October 19, 1993 to
Tucker et al.; U.S. Patent No. 5,164,189, issued November 17, 1992 to Berger et al.;
U.S. Patent No. 5,163,899, issued November 17, 1992 to Sibalis; U.S. Patent Nos.
5,088,977 and 5,087,240, both issued February 18, 1992 to Sibalis; U.S. Patent No.
5,008,110, issued April 16, 1991 to Benecke et al.; and U.S. Patent No. 4,921,475,
issued May 1, 1990 to Sibalis, the disclosure of each of which is incorporated herein 
by reference in its entirety. 

It can be readily appreciated that a transdermal route of administration may be 
enhanced by use of a dermal penetration enhancer, e.g., such as enhancers described in
U.S. Patent No. 5,164,189 (supra), U.S. Patent No. 5,008,110 (supra), and U.S. Patent
No. 4,879,119, issued November 7, 1989 to Aruga et al., the disclosure of each of
which is incorporated herein by reference in its entirety.

Pulmonary Delivery. Also contemplated herein is pulmonary delivery of the 
pharmaceutical compositions of the present invention. A pharmaceutical composition 
of the present invention is delivered to the lungs of a mammal while inhaling and 
traverses across the lung epithelial lining to the blood stream. Other reports of this
include Adjei et al. [Pharmaceutical Research, 7:565-569 (1990); Adjei et al.,
International Journal of Pharmaceutics, 63:135-144 (1990) (leuprolide acetate);
Braquet et al., Journal of Cardiovascular Pharmacology, 13(suppl. 5):143-146 (1989)
(endothelin-1); Hubbard et al., Annals of Internal Medicine, Vol. III, pp. 206-212
(1989) (α1-antitrypsin); Smith et al., J. Clin. Invest., 84:1145-1146 (1989) (α-1-
proteinase); Oswein et al., "Aerosolization of Proteins", Proceedings of Symposium on
Respiratory Drug Delivery II, Keystone, Colorado, March, (1990) (recombinant human
growth hormone); Debs et al., J. Immunol., 140:3482-3488 (1988) (interferon-γ and
tumor necrosis factor alpha); Platz et al., U.S. Patent No. 5,284,656 (granulocyte
colony stimulating factor)]. A method and composition for pulmonary delivery of
drugs for systemic effect is described in U.S. Patent No. 5,451,569, issued September
19, 1995 to Wong et al.

A subject in whom administration of an agent of the present invention is an effective 
therapeutic regimen for cancer, for example, is preferably a human, but can be any
animal. Thus, as can be readily appreciated by one of ordinary skill in the art, the 
methods and pharmaceutical compositions of the present invention are particularly 
suited to administration to any animal, e.g., for veterinary medical use, particularly for 
a mammal, and including, but by no means limited to, domestic animals, such as feline 
or canine subjects, farm animals, including bovine, equine, caprine, ovine, and porcine 
subjects, wild animals (whether in the wild or in a zoological garden), research 
animals, such as mice, rats, rabbits, goats, sheep, pigs, dogs, cats, avian species, such 
as chickens, turkeys, and songbirds. 

The present invention may be better understood by reference to the following non- 
limiting Example, which is provided as exemplary of the invention. The following 
example is presented in order to more fully illustrate the preferred embodiments of the
invention. It should in no way be construed, however, as limiting the broad scope of
the invention.

EXAMPLE

STRUCTURE AND LIGAND OF A HISTONE
ACETYLTRANSFERASE BROMODOMAIN

Introduction 

The bromodomain is a protein motif comprising approximately 110 amino acids that is
found in practically all nuclear histone acetyltransferases (HATs) [Jeanmougin et al.,
Trends in Biochemical Sciences, 22:151-153 (1997)]. However, despite the seemingly
requisite occurrence of this motif in HATs, its role in these enzymes is unknown.
Indeed, although this motif has also been identified in other chromatin proteins, 
heretofore not even one binding partner for a bromodomain had been identified. 

Materials and Methods 
Sample preparation: The bromodomain of P/CAF (residues 719-832 of SEQ ID NO:2) 
was subcloned into the pET14b expression vector (Novagen) and expressed in 
Escherichia coli BL21(DE3) cells. Uniformly ¹⁵N- and ¹⁵N/¹³C-labelled proteins were
prepared by growing bacteria in a minimal medium containing ¹⁵NH₄Cl with or
without ¹³C₆-glucose. A uniformly ¹⁵N/¹³C-labelled and fractionally deuterated protein
sample was prepared by growing the cells in 75% ²H₂O. The bromodomain was
purified by affinity chromatography on a nickel-IDA column (Invitrogen) followed by 
the removal of poly-His tag by thrombin cleavage. The final purification of the protein 
was achieved by size-exclusion chromatography. The acetyl-lysine-containing 
peptides were prepared on a MilliGen 9050 peptide synthesizer (Perkin Elmer) using 
Fmoc/HBTU chemistry. Acetyl-lysine was incorporated using the reagent 
Fmoc-Ac-Lys with HBTU/DIPEA activation. NMR samples contained approximately 
1 mM protein in 100 mM phosphate buffer of pH 6.5 and 5 mM perdeuterated DTT and
0.5 mM EDTA in H₂O/²H₂O (9/1) or ²H₂O.

NMR spectroscopy: All NMR spectra were acquired at 30 °C on a Bruker DRX600 or
DRX500 spectrometer. The backbone assignments of the ¹H, ¹³C, and ¹⁵N resonances
were achieved using deuterium-decoupled triple-resonance experiments of HNCACB
and HN(CO)CACB [Yamazaki et al., J. Am. Chem. Soc. 116:11655-11666 (1994)]
recorded using the uniformly ¹⁵N/¹³C-labeled and fractionally deuterated protein. The
side-chain atoms were assigned from 3D HCCH-TOCSY [Clore and Gronenborn,
Meth. Enzymol. 239:249-363 (1994)] and (H)C(CO)NH-TOCSY [Logan et al., J.
Biomol. NMR 3:225-231 (1993)] data collected on the uniformly ¹⁵N/¹³C-labeled
protein. Stereospecific assignments of methyl groups of the Val and Leu residues were
obtained using a fractionally ¹³C-labeled sample [Neri et al., Biochemistry 28:7510-
7516 (1989)]. The NOE-derived distance restraints were obtained from ¹⁵N- or
¹³C-edited 3D NOESY spectra. φ-angle restraints were determined based on the
³J(HN,Hα) coupling constants measured in a 3D HNHA spectrum [Clore and Gronenborn,
Meth. Enzymol. 239:249-363 (1994)]. Slowly exchanging amide protons were
identified from a series of 2D ¹⁵N-HSQC spectra recorded after the H₂O buffer was
changed to a ²H₂O buffer. The intermolecular NOEs used in defining the structure of
the bromodomain/Ac-histamine complex were detected in a ¹³C-edited (F₁),
¹³C/¹⁵N-filtered (F₃) 3D NOESY spectrum [Clore and Gronenborn, Meth. Enzymol.
239:249-363 (1994)]. All NMR spectra were processed with the NMRPipe/NMRDraw
programs and analyzed using NMRView [Johnson and Blevins, J. Biomol. NMR
4:603-614 (1994)].

Structure calculations: Structures of the bromodomain were calculated with a distance
geometry/simulated annealing protocol using the X-PLOR program [Brunger, A. X-PLOR
Version 3.1: A system for X-Ray crystallography and NMR, Yale University
Press, New Haven, CT, (1993)]. A total of 1324 manually assigned NOE-derived
distance restraints were obtained from the ¹⁵N- and ¹³C-edited NOE spectra. Further
analysis of the NOE spectra was carried out by the iterative automated assignment
procedure using ARIA [Nilges and O'Donoghue, Prog. NMR Spectroscopy 32:107-139
(1998)], which integrates with X-PLOR for structure calculations. A total of 1519
unambiguous and 590 ambiguous distance restraints were identified from the NOE
data by ARIA, many of which were checked and confirmed manually. The
ARIA-assigned distance restraints were in agreement with the structures calculated
using only the manually assigned NOE distance restraints, 28 hydrogen-bond distance
restraints for 14 hydrogen bonds, and 54 φ-angle restraints. The final structure
calculations employed a total of 3515 NMR experimental restraints obtained from the
manual and the ARIA-assisted assignments, 2843 of which were unambiguously
assigned NOE-derived distance restraints that comprise 1077 intra-residue, 621
sequential, 550 medium-range, and 595 long-range NOEs. For the ensemble of the
final 30 structures, no distance and torsional angle restraints were violated by more
than 0.3 Å and 5°, respectively. The total, distance violation, and dihedral violation
energies were 178.7 ± 2.4 kcal mol⁻¹, 41.6 ± 0.9 kcal mol⁻¹ and 0.50 ± 0.06 kcal mol⁻¹,
respectively. The Lennard-Jones potential, which was not used during any refinement
stage, was -526.2 ± 16.8 kcal mol⁻¹ for the final structures. Ramachandran plot analysis
of the final structures (residues 727-828) with Procheck-NMR [Laskowski et al., J.
Biomol. NMR 8:477-486 (1996)] showed that 71.0 ± 0.6%, 23.8 ± 0.6%, 3.5 ± 0.2%,
and 1.7 ± 0.2% of the non-Gly and non-Pro residues were in the most favorable,
additionally allowed, generously allowed, and disallowed regions, respectively. The
corresponding values for the residues in the four α-helices (residues 727-743, 770-776,
785-802, and 807-827) were 88.9 ± 0.4%, 11.0 ± 0.4%, 0.1 ± 0.1%, and 0.0 ± 0.0%,
respectively. The structure of the bromodomain/acetyl-histamine complex was
determined using the free form structure and additional 25 intermolecular and 5
intra-ligand NOE-derived distance restraints.
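
The restraint counts quoted in the structure calculation paragraph can be cross-checked with simple arithmetic. The Python sketch below is illustrative only and is not part of the original protocol. The variable and function names are hypothetical, and the grouping of the 3515 restraints into unambiguous NOE, ambiguous NOE, hydrogen-bond, and φ-angle restraints is an assumption, chosen because it reproduces the stated total.

```python
import math

# Counts copied from the structure calculation paragraph; names are illustrative.
MANUAL_NOE = 1324            # manually assigned NOE distance restraints
ARIA_UNAMBIGUOUS_NOE = 1519  # unambiguous restraints identified by ARIA
ARIA_AMBIGUOUS_NOE = 590     # ambiguous restraints identified by ARIA
HYDROGEN_BOND = 28           # distance restraints for 14 hydrogen bonds
PHI_ANGLE = 54               # phi-angle restraints

# Breakdown of the unambiguous NOE restraints by sequence separation.
NOE_BY_RANGE = {
    "intra-residue": 1077,
    "sequential": 621,
    "medium-range": 550,
    "long-range": 595,
}

# Ramachandran percentages: most favorable, additionally allowed,
# generously allowed, disallowed.
RAMACHANDRAN_ALL = (71.0, 23.8, 3.5, 1.7)
RAMACHANDRAN_HELICES = (88.9, 11.0, 0.1, 0.0)


def restraint_summary():
    """Return the derived totals so they can be compared with the quoted values."""
    unambiguous = sum(NOE_BY_RANGE.values())
    # Assumed grouping of the final restraint set (see lead-in).
    total = unambiguous + ARIA_AMBIGUOUS_NOE + HYDROGEN_BOND + PHI_ANGLE
    return {
        "unambiguous_noe": unambiguous,
        "manual_plus_aria": MANUAL_NOE + ARIA_UNAMBIGUOUS_NOE,
        "total": total,
    }


def percentages_sum_to_100(values):
    """Check that a set of Ramachandran percentages adds up to 100%."""
    return math.isclose(sum(values), 100.0, abs_tol=1e-6)


if __name__ == "__main__":
    print(restraint_summary())
    print(percentages_sum_to_100(RAMACHANDRAN_ALL))
    print(percentages_sum_to_100(RAMACHANDRAN_HELICES))
```

Under these assumptions the four NOE classes add up to the 2843 unambiguous restraints, which also equals the manual plus ARIA unambiguous counts, and the grouped total matches the 3515 restraints quoted above.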

Site-directed mutagenesis: Mutant proteins were prepared using the QuickChange 
site-directed mutagenesis kit (Stratagene). The presence of appropriate mutations was 
confirmed by DNA sequencing. 

Ligand titration: Ligand titration experiments were performed by recording a series of 
2D ¹⁵N- and ¹³C-HSQC spectra on the uniformly ¹⁵N- and ¹⁵N/¹³C-labelled
bromodomain (~0.3 mM), respectively, in the presence of ligand
concentrations ranging from 0 to approximately 2.0 mM. The protein sample and the
stock solutions of the ligands were all prepared in the same aqueous buffer containing
100 mM phosphate and 5 mM perdeuterated DTT at pH 6.5.
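
Titration data of this kind are commonly analyzed by fitting the observed chemical shift change to a 1:1 binding model under fast exchange. The text does not state how the titration data were analyzed, so the Python sketch below is an assumption offered for illustration only. The function names, the grid-search fit, and the synthetic data points (generated from a hypothetical Kd of 0.35 mM and a hypothetical maximum shift of 0.12 ppm) are not from the original disclosure. Only the protein concentration (about 0.3 mM) and the ligand range (0 to about 2.0 mM) are taken from the text.

```python
import math


def bound_fraction(protein_mM, ligand_mM, kd_mM):
    """Fraction of protein bound for 1:1 binding (exact quadratic solution)."""
    s = protein_mM + ligand_mM + kd_mM
    return (s - math.sqrt(s * s - 4.0 * protein_mM * ligand_mM)) / (2.0 * protein_mM)


def fit_kd(ligand_mM, shifts_ppm, protein_mM=0.3):
    """Grid-search Kd (0.01 to 5.00 mM); the maximum shift is solved by least squares."""
    best = None
    for step in range(1, 501):
        kd = 0.01 * step
        fractions = [bound_fraction(protein_mM, lig, kd) for lig in ligand_mM]
        denom = sum(f * f for f in fractions)
        if denom == 0.0:
            continue
        # Best-fitting maximum shift for this trial Kd (linear least squares).
        dmax = sum(f * y for f, y in zip(fractions, shifts_ppm)) / denom
        sse = sum((dmax * f - y) ** 2 for f, y in zip(fractions, shifts_ppm))
        if best is None or sse < best[0]:
            best = (sse, kd, dmax)
    return best[1], best[2]


if __name__ == "__main__":
    # Synthetic, noise-free example data; NOT measured values.
    ligand = [0.0, 0.1, 0.2, 0.4, 0.8, 1.2, 2.0]
    shifts = [0.12 * bound_fraction(0.3, lig, 0.35) for lig in ligand]
    kd_fit, dmax_fit = fit_kd(ligand, shifts)
    print(f"Kd ~ {kd_fit:.2f} mM, max shift ~ {dmax_fit:.3f} ppm")
```

The exact quadratic form is used rather than the simpler hyperbolic approximation because the protein concentration here is comparable to the ligand concentrations, so ligand depletion cannot be neglected.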

The full length nucleic acid sequence of the human p300/CBP-associated factor 
(P/CAF) was obtained from GenBank, Accession No: U57317.2 (SEQ ID NO:1):

1 ggggccgcgt cgacgcggaa aagaggccgt ggggggcctc ccagcgctgg cagacaccgt
61 gaggctggca gccgccggca cgcacaccta gtccgcagtc ccgaggaaca tgtccgcagc
121 cagggcgcgg agcagagtcc cgggcaggag aaccaaggga gggcgtgtgc tgtggcggcg
181 gcggcagcgg cagcggagcc gctagtcccc tccctcctgg gggagcagct gccgccgctg
241 ccgccgccgc caccaccatc agcgcgcggg gcccggccag agcgagccgg gcgagcggcg
301 cgctaggggg agggcggggg cggggagggg ggtgggcgaa gggggcggga gggcgtgggg
361 ggagggtctc gctctcccga ctaccagagc ccgagggaga ccctggcggc ggcggcggcg 
421 cctgacactc ggcgcctcct gccgtgctcc ggggcggcat gtccgaggct ggcggggccg 
481 ggccgggcgg ctgcggggca ggagccgggg caggggccgg gcccggggcg ctgcccccgc 
541 agcctgcggc gcttccgccc gcgcccccgc agggctcccc ctgcgccgct gccgccgggg
601 gctcgggcgc ctgcggtccg gcgacggcag tggctgcagc gggcacggcc gaaggaccgg 
661 gaggcggtgg ctcggcccga atcgccgtga agaaagcgca actacgctcc gctccgcggg 
721 ccaagaaact ggagaaactc ggagtgtact ccgcctgcaa ggccgaggag tcttgtaaat 
781 gtaatggctg gaaaaaccct aacccctcac ccactccccc cagagccgac ctgcagcaaa 
841 taattgtcag tctaacagaa tcctgtcgga gttgtagcca tgccctagct gctcatgttt 
901 cccacctgga gaatgtgtca gaggaagaaa tgaacagact cctgggaata gtattggatg
961 tggaatatct ctttacctgt gtccacaagg aagaagatgc agataccaaa caagtttatt 
1021 tctatctatt taagctcttg agaaagtcta ttttacaaag aggaaaacct gtggttgaag 
1081 gctctttgga aaagaaaccc ccatttgaaa aacctagcat tgaacagggt gtgaataact 
1141 ttgtgcagta caaatttagt cacctgccag caaaagaaag gcaaacaata gttgagttgg 
1201 caaaaatgtt cctaaaccgc atcaactatt ggcatctgga ggcaccatct caacgaagac 
1261 tgcgatctcc caatgatgat atttctggat acaaagagaa ctacacaagg tggctgtgtt 
1321 actgcaacgt gccacagttc tgcgacagtc tacctcggta cgaaaccaca caggtgtttg
1381 ggagaacatt gcttcgctcg gtcttcactg ttatgaggcg acaactcctg gaacaagcaa 
1441 gacaggaaaa agataaactg cctcttgaaa aacgaactct aatcctcact catttcccaa 
1501 aatttctgtc catgctagaa gaagaagtat atagtcaaaa ctctcccatc tgggatcagg 
1561 attttctctc agcctcttcc agaaccagcc agctaggcat ccaaacagtt atcaatccac 
1621 ctcctgtggc tgggacaatt tcatacaatt caacctcatc ttcccttgag cagccaaacg 
1681 cagggagcag cagtcctgcc tgcaaagcct cttctggact tgaggcaaac ccaggagaaa 
1741 agaggaaaat gactgattct catgttctgg aggaggccaa gaaaccccga gttatggggg 
1801 atattccgat ggaattaatc aacgaggtta tgtctaccat cacggaccct gcagcaatgc 
1861 ttggaccaga gaccaatttt ctgtcagcac actcggccag ggatgaggcg gcaaggttgg 
1921 aagagcgcag gggtgtaatt gaatttcacg tggttggcaa ttccctcaac cagaaaccaa 
1981 acaagaagat cctgatgtgg ctggttggcc tacagaacgt tttctcccac cagctgcccc 
2041 gaatgccaaa agaatacatc acacggctcg tctttgaccc gaaacacaaa acccttgctt 
2101 taattaaaga tggccgtgtt attggtggta tctgtttccg tatgttccca tctcaaggat 
2161 tcacagagat tgtcttctgt gctgtaacct caaatgagca agtcaagggc tatggaacac 
2221 acctgatgaa tcatttgaaa gaatatcaca taaagcatga catcctgaac ttcctcacat 
2281 atgcagatga atatgcaatt ggatacttta agaaacaggg tttctccaaa gaaattaaaa 
2341 tacctaaaac caaatatgtt ggctatatca aggattatga aggagccact ttaatgggat 
2401 gtgagctaaa tccacggatc ccgtacacag aattttctgt catcattaaa aagcagaagg 
2461 agataattaa aaaactgatt gaaagaaaac aggcacaaat tcgaaaagtt taccctggac
2521 tttcatgttt taaagatgga gttcgacaga ttcctataga aagcattcct ggaattagag 
2581 agacaggctg gaaaccgagt ggaaaagaga aaagtaaaga gcccagagac cctgaccagc 
2641 tttacagcac gctcaagagc atcctccagc aggtgaagag ccatcaaagc gcttggccct 
2701 tcatggaacc tgtgaagaga acagaagctc caggatatta tgaagttata aggttcccca
2761 tggatctgaa aaccatgagt gaacgcctca agaataggta ctacgtgtct aagaaattat 
2821 tcatggcaga cttacagcga gtctttacca attgcaaaga gtacaacgcc gctgagagtg 
2881 aatactacaa atgtgccaat atcctggaga aattcttctt cagtaaaatt aaggaagctg
2941 gattaattga caagtgattt tttttccccc tctgcttctt agaaactcac caagcagtgt
3001 gcctaaagca aggt 

The full length protein sequence of the human p300/CBP-associated factor (P/CAF) 
was obtained from GenBank. Accession No: U57317.2, (SEQ ID NO:2): 

1 MSEAGGAGPG GCGAGAGAGA GPGALPPQPA ALPPAPPQGS PCAAAAGGSG ACGPATAVAA 
61 AGTAEGPGGG GSARIAVKKA QLRSAPRAKK LEKLGVYSAC KAEESCKCNG WKNPNPSPTP 
121 PRADLQQIIV SLTESCRSCS HALAAHVSHL ENVSEEEMNR LLGIVLDVEY LFTCVHKEED 
181 ADTKQVYFYL FKLLRKSILQ RGKPVVEGSL EKKPPFEKPS IEQGVNNFVQ YKFSHLPAKE
241 RQTIVELAKM FLNRINYWHL EAPSQRRLRS PNDDISGYKE NYTRWLCYCN VPQFCDSLPR
301 YETTQVFGRT LLRSVFTVMR RQLLEQARQE KDKLPLEKRT LILTHFPKFL SMLEEEVYSQ 
361 NSPIWDQDFL SASSRTSQLG IQTVINPPPV AGTISYNSTS SSLEQPNAGS SSPACKASSG 
421 LEANPGEKRK MTDSHVLEEA KKPRVMGDIP MELINEVMST ITDPAAMLGP ETNFLSAHSA
481 RDEAARLEER RGVIEFHVVG NSLNQKPNKK ILMWLVGLQN VFSHQLPRMP KEYITRLVFD
541 PKHKTLALIK DGRVIGGICF RMFPSQGFTE IVFCAVTSNE QVKGYGTHLM NHLKEYHIKH
601 DILNFLTYAD EYAIGYFKKQ GFSKEIKIPK TKYVGYIKDY EGATLMGCEL NPRIPYTEFS 
661 VIIKKQKEII KKLIERKQAQ IRKVYPGLSC FKDGVRQIPI ESIPGIRETG WKPSGKEKSK 
721 EPRDPDQLYS TLKSILQQVK SHQSAWPFME PVKRTEAPGY YEVIRFPMDL KTMSERLKNR 
781 YYVSKKLFMA DLQRVFTNCK EYNAAESEYY KCANILEKFF FSKIKEAGLI DK 
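
The two listings above use GenBank-style numbering: a position index followed by groups of ten residues. As a hedged illustration that is not part of the original disclosure, the Python sketch below joins such a listing into one string, translates a stretch of SEQ ID NO:1 with the standard genetic code so it can be compared with SEQ ID NO:2, and extracts a residue range such as the bromodomain (residues 719-832). All function names are hypothetical. The choice of nucleotide 459 as the first base of the ATG start codon is an assumption inferred from reading the listing, since the text does not state it.

```python
# Standard genetic code, codons ordered with bases t, c, a, g at each position.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def parse_numbered_block(text):
    """Join a numbered listing into one string, dropping indices and spaces."""
    pieces = []
    for line in text.splitlines():
        tokens = line.split()
        if tokens and tokens[0].isdigit():
            tokens = tokens[1:]  # drop the leading position index
        pieces.append("".join(tokens))
    return "".join(pieces)


def translate(dna):
    """Translate DNA in frame from its first base, stopping at a stop codon.

    Raises KeyError if a codon contains a character other than a, c, g, t.
    """
    dna = dna.lower()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def slice_residues(sequence, first, last):
    """Return residues first..last using 1-based, inclusive numbering."""
    return sequence[first - 1:last]


if __name__ == "__main__":
    # Lines 421 and 481 of SEQ ID NO:1 and line 1 of SEQ ID NO:2, as listed.
    dna_block = (
        "421 cctgacactc ggcgcctcct gccgtgctcc ggggcggcat gtccgaggct ggcggggccg\n"
        "481 ggccgggcgg ctgcggggca ggagccgggg caggggccgg gcccggggcg ctgcccccgc\n"
    )
    protein_block = (
        "1 MSEAGGAGPG GCGAGAGAGA GPGALPPQPA ALPPAPPQGS PCAAAAGGSG ACGPATAVAA\n"
    )
    dna = parse_numbered_block(dna_block)
    protein = parse_numbered_block(protein_block)
    start_index = 459 - 421  # assumed start codon position, relative to this block
    print(translate(dna[start_index:]))
    print(protein[:27])
```

Applied to the full listings, `slice_residues(protein, 719, 832)` would return the bromodomain construct described under Sample preparation, provided the full SEQ ID NO:2 string has been parsed first.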

Results 

The P/CAF bromodomain represents an extensive family of bromodomains (Figure 1). 
A large number of long-range nuclear Overhauser enhancement (NOE)-derived 
distance restraints were identified in the NMR data of the P/CAF bromodomain, 
yielding a well-defined three-dimensional structure (Figures 2A-2D). Table 1 shows
the NMR chemical shift assignment of the P/CAF bromodomain. Table 2 shows the
unambiguous NOE-derived distance restraints. Table 3 shows the ambiguous NOE-derived
distance restraints. Table 4 shows the hydrogen bond restraints. The NMR
structure coordinates of the P/CAF bromodomain in the free form and complexed to acetyl-
histamine are shown in Tables 5 and 6, respectively.

The structure consists of a four-helix bundle (helices αZ, αA, αB, and αC) with a
left-handed twist, and a long intervening loop between helices αZ and αA (termed the
ZA loop, Figure 2E). The four amphipathic α-helices are packed tightly against one
another in an antiparallel manner, with crossing angles for adjacent helices of ~16-20°.
The up-and-down four-helix bundle can adopt two topological folds with opposite
handedness (Figures 2F-2G). The right-handed four-helix bundle fold occurs more
commonly and is seen in proteins such as hemerythrin and cytochrome b562. The
left-handed fold of the bromodomain structure is less common, but also observed in
proteins such as cytochrome b5 and T4 lysozyme [Richardson, J., Adv. Protein Chem.,
34:167-339 (1989); Presnell and Cohen, Proc. Natl. Acad. Sci. USA 86:6592-6596
(1989)]. This topological difference arises from the orientation of the loop between the
first two helices (Fig. 2F-2G). The right-handed four-helix bundle proteins have a
relatively short hairpin-like connection between the first two helices, which makes the
"preferred" turn to the right at the top of the first helix [Richardson, J., Adv. Protein
Chem., 34:167-339 (1989); Presnell and Cohen, Proc. Natl. Acad. Sci. USA 86:6592-
6596 (1989); Weber and Salemme, Nature 287:82-84 (1980)]. In contrast, proteins
with the left-handed fold usually have a long loop after the first helix and often contain
additional secondary structural elements at the base of the helix bundle [Richardson, J.,
Adv. Protein Chem., 34:167-339 (1989); Presnell and Cohen, Proc. Natl. Acad. Sci.
USA 86:6592-6596 (1989)]. In the bromodomain structure, this long ZA loop has a
defined conformation and is packed against the loop between helices αB and αC (termed
the BC loop) to form a hydrophobic pocket. These tertiary interactions between the
two loops appear to favor the left turn of the ZA loop, resulting in the left-handed
four-helix bundle fold of the bromodomain. The hydrophobic pocket formed by loops 
ZA and BC is lined by residues Val752, Ala757, Tyr760, Val763, Tyr802 and Tyr809 
(Fig. 2H), and appears to be a site for protein-protein interactions (see below). The 
pocket is located at one end of the four-helix bundle, opposite to the N- and C-termini 
of the protein. Interestingly, the ZA loop varies in length amongst different 
bromodomains, but almost always contains residues corresponding to Phe748, Pro751, 
Pro758, Tyr760, and Pro767 (Figure 1). The conservation of these residues within the 
ZA loop as well as residues within the a-heUcal regions imphes a similar left-handed 
four-helix bundle structure for the large family of bromodomains (Fig. 1). 

The modular bromodomain structure supports the idea that the bromodomain can act as a
functional unit for protein-protein interactions. The observation that bromodomains
are found in nearly all known nuclear HATs (A-type), which promote
transcription-related acetylation of histones on specific lysine residues, but are not
present in cytoplasmic HATs (B-type), prompted the determination of whether
bromodomains can interact with acetyl-lysine (AcK). The NMR titration of the P/CAF
bromodomain was performed with a peptide (SGRGKGG-acK-GLGK) derived from histone H4,
in which Lys8 is acetylated (Lys8 is the major acetylation site in H4 for GCN5, a yeast
homologue of P/CAF). Remarkably, the bromodomain could indeed bind the AcK
peptide. Moreover, this interaction appeared to be specific, based on the 15N-HSQC
spectra, which showed that only a limited number of residues underwent chemical shift
changes as a function of peptide concentration (Figure 3A). Conversely, the NMR
titration of the bromodomain with a non-acetylated, but otherwise identical, H4 peptide
showed no noticeable chemical shift changes, demonstrating that the interaction
between the bromodomain and the lysine-acetylated H4 peptide was dependent upon
acetylation of lysine. The dissociation constant (K_D) for the AcK peptide was
estimated to be 346 ± 54 μM. This binding is likely reinforced through additional
interactions between bromodomain-containing proteins and target proteins. Notably,
many chromatin-associated proteins contain two or multiple bromodomains (Figure 1).
Indeed, binding with another lysine-acetylated peptide (RKSTGG-acK-APRKQ)
derived from the major acetylation site on histone H3 (residues 9-20) was also
observed. Together, these data demonstrate that the P/CAF bromodomain has the
ability to bind AcK peptides in an acetylation-dependent manner.
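
A dissociation constant of this kind is commonly estimated by fitting the observed
chemical shift change to a 1:1 binding isotherm. The Python sketch below is
illustrative only: the fitting routine used for the measurement reported here is not
specified in this document, and the protein concentration, ligand concentrations, and
shift values are hypothetical numbers generated from the model itself rather than
experimental data. The function names fraction_bound and fit_kd are likewise
assumptions of this example.

```python
import math


def fraction_bound(p_total, l_total, kd):
    """Fraction of protein bound for 1:1 binding (all values in the same units)."""
    s = p_total + l_total + kd
    return (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / (2.0 * p_total)


def fit_kd(p_total, ligand_concs, observed, kd_min=1.0, kd_max=1e5, n_grid=2001):
    """Grid search over kd on a log scale; the maximal shift is solved linearly."""
    best = None
    for i in range(n_grid):
        kd = kd_min * (kd_max / kd_min) ** (i / (n_grid - 1))
        fb = [fraction_bound(p_total, l, kd) for l in ligand_concs]
        denom = sum(f * f for f in fb)
        if denom == 0.0:
            continue
        # Least-squares optimum of the maximal shift change for this kd.
        dmax = sum(o * f for o, f in zip(observed, fb)) / denom
        sse = sum((o - dmax * f) ** 2 for o, f in zip(observed, fb))
        if best is None or sse < best[0]:
            best = (sse, kd, dmax)
    return best[1], best[2]


# Hypothetical titration (micromolar concentrations, shifts in ppm).
P_TOTAL_UM = 500.0
LIGAND_UM = [0.0, 100.0, 250.0, 500.0, 1000.0, 2000.0, 4000.0]
TRUE_KD_UM = 350.0   # assumed value used only to simulate the data
TRUE_DMAX_PPM = 0.20  # assumed maximal shift change

observed_ppm = [TRUE_DMAX_PPM * fraction_bound(P_TOTAL_UM, l, TRUE_KD_UM)
                for l in LIGAND_UM]

kd_fit, dmax_fit = fit_kd(P_TOTAL_UM, LIGAND_UM, observed_ppm)
print(f"fitted kd = {kd_fit:.0f} uM, fitted maximal shift = {dmax_fit:.3f} ppm")
```

In practice the fit would be repeated for each perturbed amide peak, and the spread of
the per-residue values would give an uncertainty of the kind quoted above.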

Intriguingly, the bromodomain residues that exhibited the most significant 1H and 15N
chemical shift changes on peptide binding are located near the hydrophobic pocket
between the ZA and BC loops (Figure 3B). Because a similar pattern of amide
chemical shift changes was observed with the two different AcK-containing peptides,
it was surmised that the hydrophobic cavity is the primary binding site for AcK. This
hypothesis was further supported by titration with acetyl-histamine, which mimics the
chemical structure of the AcK side-chain (Figure 3C). Both 15N- and 13C-HSQC
spectra showed that interaction with acetyl-histamine was also acetylation-dependent,
involving the same set of residues that showed chemical shift perturbations with
similar concentration dependence. It should be noted that the bromodomain did not
bind to the amino acids acetyl-lysine or acetyl-histidine alone, possibly due to the
presence of the charged amino, carboxyl, or carboxylate group adjacent to the acetyl
moiety (Figure 3C). Taken together, these results strongly suggest that the P/CAF
bromodomain can interact with acetyl-lysine-containing proteins in a specific manner,
and that this interaction is localized to the bromodomain hydrophobic cavity.
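
Residues can be ranked by chemical shift change using a combined 1H/15N perturbation
value. The following Python sketch shows one widely used form, the square root of the
sum of the squared 1H change and the squared, down-weighted 15N change. The weighting
factor of 0.2, the residue labels R1 to R3, and all shift values are assumptions
chosen for illustration; they are not data from this disclosure, and the weighting
actually applied in the analysis above is not stated.

```python
import math


def combined_shift_change(d_h, d_n, n_weight=0.2):
    """Combined 1H/15N perturbation in ppm; n_weight scales the 15N change."""
    return math.sqrt(d_h ** 2 + (n_weight * d_n) ** 2)


# Hypothetical (1H ppm, 15N ppm) peak positions without and with ligand.
free = {"R1": (8.00, 120.0), "R2": (7.50, 115.0), "R3": (9.10, 125.0)}
bound = {"R1": (8.00, 120.0), "R2": (7.53, 115.4), "R3": (9.16, 124.5)}

csp = {}
for res, (h_free, n_free) in free.items():
    h_bound, n_bound = bound[res]
    csp[res] = combined_shift_change(h_bound - h_free, n_bound - n_free)

# Largest perturbation first.
ranked = sorted(csp.items(), key=lambda kv: kv[1], reverse=True)
for res, value in ranked:
    print(f"{res}: {value:.3f} ppm")
```

Mapping the top-ranked residues onto the structure is then what localizes the binding
site, as described for the pocket between the ZA and BC loops.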

To identify the key residues involved in bromodomain-AcK recognition, the NMR
structure of the P/CAF bromodomain in complex with acetyl-histamine was elucidated.
As anticipated, the acetylated moiety binds in the bromodomain hydrophobic pocket
(Figure 4). The intermolecular interactions are largely hydrophobic in nature, with the
methyl group of acetyl-histamine making extensive contacts with the side-chains of
Val752, Ala757, and Tyr760, and the methylene groups of acetyl-histamine displaying
specific NOEs to Val752, Ala757, Tyr760, Tyr802, and Tyr809. No intermolecular
NOEs were observed for the imidazole ring of acetyl-histamine. From the spectral
analysis it is clear that the structure of the bromodomain is very similar in both the free
and complex forms.

It is worth noting that the bromodomain-AcK recognition is reminiscent of the
interactions between the histone acetyltransferase Hat1 and acetyl-CoA. Although the
binding pockets of these two otherwise structurally unrelated proteins are composed of
different secondary structural elements, the nature of acetyl-lysine recognition has
striking similarities. In particular, Tyr809, Tyr802, Tyr760, and Val752 in the
bromodomain appear to be related to Phe220, Phe261, Val254, and Ile217 of Hat1,
respectively, in their interactions with the acetyl moiety. This observation may suggest
an evolutionarily convergent mechanism of acetyl-lysine recognition between
bromodomains and histone acetyltransferases.

To determine the relative contributions of residues within the hydrophobic cavity to
bromodomain-AcK binding, site-directed mutagenesis was used to alter residues
Tyr809, Tyr802, Tyr760, and Val752 (Table 7).

Table 7. Structural and Functional Analysis of the P/CAF Bromodomain Mutants

Bromodomain      Structural         H4 AcK-Peptide Binding,
Proteins         Integrity (a)      K_D (μM) (b)

Wild-Type        ++++               346 ± 54
Tyr809Ala        ++++               No Binding (c)
Tyr802Ala        +++                > 10,000 (d)
Tyr760Ala        +++                > 10,000
Val752Ala        ++                 > 10,000

a. The effects of mutations on the structural integrity of the bromodomain were
assessed by using the 15N-HSQC spectra. The amide 1H/15N resonances of the mutant
proteins were compared to those of the wild-type bromodomain to determine whether the
particular mutations lead to global or local structure disruption. Severe
line-broadening of the amide resonances would indicate protein conformational
exchange due to a decrease of structure stability resulting from point mutations.
Structural integrity of the mutant proteins is expressed here relative to that of the
wild-type, using the signs of "++++" for as stable as the wild-type, "+++" for mildly
destabilized, "++" for moderately destabilized, and for completely unfolded.

b. The ligand binding affinity (K_D) of the bromodomain proteins was estimated by
following chemical shift changes of amide peaks in the 15N-HSQC spectra as a
function of the ligand concentration.

c. No detectable ligand binding observed in the NMR titration.

d. Ligand binding affinity was significantly reduced and beyond the limit for reliable
measurements by NMR titration.

Substitution of Ala for Tyr809 completely abrogated bromodomain binding to the
lysine-acetylated H4 peptide, while the Tyr802Ala, Tyr760Ala, and Val752Ala
mutants had significantly reduced ligand binding affinity. To assess whether these
mutations disrupted the overall bromodomain fold, the 15N-HSQC spectra of the
mutants were compared to that of the wild-type protein. For the Tyr809Ala mutant, the
amide chemical shifts were only affected for a few residues near the mutation site.
However, mutations of the other residues in the hydrophobic binding pocket perturbed
the local protein conformation to greater extents, particularly the ZA loop (Table 7).
Thus, the NMR structural analysis and the mutagenesis studies show that Tyr809,
which is structurally supported by Trp746 and Asn803 (Figure 4), is essential for the
bromodomain interaction with the acetyl group of acetyl-lysine, while residues
Tyr802, Tyr760, and Val752 likely play both structural and functional roles in the
recognition. These residues are highly conserved throughout the bromodomain family
(Figure 1), suggesting that recognition of acetyl-lysine may be a general feature of
bromodomains. Therefore, Val752, Ala757, Tyr760, Tyr802, Asn803, and
Tyr809 are key amino acid residues for the P/CAF bromodomain binding to
acetyl-lysine.



Table 8: Amino Acid Sequences of Bromodomains Identified in Figure 1

PROTEIN      SEQ ID    GenBank       PROTEIN      SEQ ID    GenBank
BD           NO:       Acc. No.      BD           NO:       Acc. No.

hsP/CAF      7         U57317        dmFSH-2      25
hsGCN5       8         U57136        scBDF1-2     26
ttP55        9         U47321        hsBR140      27        JC2069
scGCN5       10        Q03330        hsSMAP       28        X87613
hsP300       11        A54277        ggPB1-1      29        X90849
hsCBP        12        S39162        ggPB1-2      30
mmCBP        13        S39161        ggPB1-3      31
ceYNJ1       14        P34545        ggPB1-4      32
hsCCG1-1     15        P21675        ggPB1-5      33
msCCG1-1     16        D26114        spBRO-1      34        S54260
hsCCG1-2     17                      spBRO-2      35
msCCG1-2     18                      hsSNF2a      36        S45251
hsRing3-1    19        P25440        hsBRG1       37        S39039
hsORFX-1     20        D26362        ggBRM        38        X91638
dmFSH-1      21        P13709        ggBRG1       39        X91637
scBDF1-1     22        P35817        hsTIF1b      40        X97548
hsRing3-2    23                      mmTIF1b      41        X99644
hsORFX-2     24                      mmTIF1a      42        S78219

The present invention is not to be limited in scope by the specific embodiments
described herein. Indeed, various modifications of the invention in addition to those
described herein will become apparent to those skilled in the art from the foregoing
description and the accompanying figures. Such modifications are intended to fall
within the scope of the appended claims.

It is further to be understood that all base sizes or amino acid sizes, and all molecular
weight or molecular mass values, given for nucleic acids or polypeptides are
approximate, and are provided for description.

Various publications are cited herein, the disclosures of which are hereby incorporated 
by reference herein in their entireties. 



Table 1 



NMR Chemical Shift Assignment of the P/CAF Bromodomain



RES_ID 715
RES_TYPE GLY
SPIN_SYSTEM_ID 1
HETEROGENEITY 100
END_RES_DEF

RES_ID 716
RES_TYPE SER
SPIN_SYSTEM_ID 2
HETEROGENEITY 100
END_RES_DEF

RES_ID 717
RES_TYPE HIS
SPIN_SYSTEM_ID 3
HETEROGENEITY 100
END_RES_DEF

RES_ID 718
RES_TYPE MET
SPIN_SYSTEM_ID 4
HETEROGENEITY 100
END_RES_DEF

RES_ID 719
RES_TYPE SER
SPIN_SYSTEM_ID 5
HETEROGENEITY 100
END_RES_DEF

RES_ID 720
RES_TYPE LYS
SPIN_SYSTEM_ID 6
HETEROGENEITY 100
CA 56.296000
HA 4.361000
CB 33.140000
HB1 1.882000
HB2 1.684000



HETEROGENEITY 100
N 121.192000
HN 8.416000
CA 63.430000
HA 4.331000
CB 30.930000
HB1 1.815000
HB2 1.762000
CG 27.630000
HG1 1.681000
CD 43.603000
HD1 3.161000
END_RES_DEF

RES_ID 724
RES_TYPE ASP
SPIN_SYSTEM_ID 10
HETEROGENEITY 100
N 122.012000
HN 8.273000
CA 52.415000
HA 4.874000
CB 41.400000
HB1 2.754000
HB2 2.692000
END_RES_DEF

RES_ID 725
RES_TYPE PRO
SPIN_SYSTEM_ID 11
HETEROGENEITY 100
CA 65.080000
HA 4.329000
CB 32.590000
HB1 2.326000
HB2 1.973000
CG 27.632000
HG1 2.028000
CD 51.310000
HD1 3.866000
END_RES_DEF

RES_ID 726
RES_TYPE ASP
SPIN_SYSTEM_ID 12
HETEROGENEITY 100
N 119.716000
HN 8.397000
CA 55.720000
HA 4.692000
CB 40.550000
HB1 2.792000



CA 62.320000
HA 4.038000
CB 38.640000
HB1 3.211000
HB2 3.024000
CD1 134.350000
HD1 7.053000
CE1 119.481000
HE1 6.882000
END_RES_DEF

RES_ID 730
RES_TYPE SER
SPIN_SYSTEM_ID 16
HETEROGENEITY 100
N 112.173000
HN 8.167000
HA 3.920000
HB1 3.995000
END_RES_DEF



RES_ID 731
RES_TYPE THR
SPIN_SYSTEM_ID 17
HETEROGENEITY 100
N 120.372000
HN 8.059000
CA 66.730000
HA 3.924(^00
CB 68.9^000
HB 4.247000
CG2 21.570000
HG2# 1.142000
END_RES_DEF



RES_ID 732
RES_TYPE LEU
SPIN_SYSTEM_ID 18
HETEROGENEITY 100
N 120.536000
HN 8.460000
CA 57.920000
HA 3.289000
CB 39.750000
HB1 1.532000
HB2 0.294000
CG 24.880000
HG 1.683000
CD1 25.429000
HD1# 0.469000
CD2 19.921000
HD2# -0.193000




CG1 28.733000
HG11 1.748000
HG12 1.052000
CG2 17.168000
HG2# 1.003000
CD1 13.863000
HD1# 0.619000
END_RES_DEF

RES_ID 736
RES_TYPE LEU
SPIN_SYSTEM_ID 22
HETEROGENEITY 100
N 119.880000
HN 8.841000
CA 58.473000
HA 4.090000
CB 41.950000
HB1 2.090000
HB2 1.703000
CG 27.330000
HG 1.759000
CD1 26.530000
HD1# 1.061000
CD2 23.776000
HD2# 0.977000
END_RES_DEF

RES_ID 737
RES_TYPE GLN
SPIN_SYSTEM_ID 23
HETEROGENEITY 100
N 117.256000
HN 8.505000
CA 59.020000
HA 4.032000
CB 28.182000
HB1 2.327000
HB2 2.263000
CG 34.240000
HG1 2.536000
HG2 2.461000
END_RES_DEF

RES_ID 738
RES_TYPE GLN
SPIN_SYSTEM_ID 24
HETEROGENEITY 100
N lis. 896000 
HN 8.033000 
CA 59.574000 
HA 4.196000 



CG 25.430000 




HB2 2.730000 


END_RES_DEF 




CB 29.835000 




HGl 1.585000 




END_RES_OEF 






HBl 2.482000 




HG2 1.433000 






RES_ID 


733 


HB2 2.469000 




CO 29.834000 




RES_ID 727 


RES~TYPE 


LYS 


CG 35.342000 




HDl 1.703000 




RES_TYPE GLN 


SPIN_SYSTEM_ID 


19 


HGl 3.840000 




CE 41.960000 




SPIN_SYSTE«_ID 13 


HETEROGENEITY 


100 


HG3 2.467000 




HEl 3.003000 




HETEROGENEITY 100 


N 118.568000 




NE3 110.369000 


END_RES_OEF 




N 121.3S6000. 


HN 8.563000 




UE31 7.033000 






HN 8.196000 


CA 60.135000 




HE22 6.916000 


RES^ID 


721 


CA 55.920000 


HA 3.679000 




END_RES_DEF 




res'type 


GLU 


KA 4.163000 


CB 33.588000 








SPIN_SYSTEM_ID 


7 


CB 28.730000 


HBl 1.729000 




RES_IO 


739 


HETEROGENEITY 


100 


HBl 2.148000 


HB2 1.360000 




RES_TypE 


VAL 


N 122.990000 




CG 34.240000 


CG 24.880000 




SPIN_SYSTEM_ID . 


25 


HN 9. 317000 




HGl 2.534000 


HGl 1.280000 




HETEROGENEITY 


100 


CA 54.620000 




HG2 3.371000 


CD 39.835000 




N 119.716000 




HA 4.540000 




ENO_RES_OEF 


HOI 1.585000 




HN 8.526000 




CB 29.830000 






CB 41.960000 




CA 67.830000 




HBl 2. 024000 




RES_IO 728 


HEl 3.918000 




HA 3.844000 




HB2 1.893000 




RES^TYPE LEU 


END_RES_OEF 




CB 32.030000 




CG 35.893000 




SPIN_SYSTEM_IO 14 






HB 2.384000 




HGl 2.271000 




HETEROGENEITY 100 


RES_ID 


734 


CGI 23 .330000 


END_RES_DEF 




N 121.356000 


RES~TYPE 


SER 


HGin 1.183000 






HN 8.210000 


SPIN_SYSTEM_ID 


20 


CG2 22.120000 


RES_IO 


722 


CA 58. 473000 


HETEROGENEITY 


100 


HG2# 1.033000 


RES~TYPE 


PRO 


HA 4.045000 


N 113.157000 




END_RES_DEP 




SPIN_SYSTEM_ID 


8 


CB 41.400000 


HN 7.540000 








HETEROGENEITY 


100 


HBl 1.847000 


CA 61.337000 




RES_ID 


740 


CA 63.430000 




HB2 1.555000 


HA 4.381000 




RES~TYPE 


LYS 


HA 4.393000 




CG 27.080000 


CB 63.879000 




SPIN_SYSTEM_IO 


26 


CB 32.030000 




HG 1.480000 


HBl 4.060000 




HETEROGENEITY 


100 


HBl 2.224000 




CDl 25.970000 


ENO_RES_OEF 




N 114.633000 




HB2 1.880000 




HOlt) 0.794000 






HN 8.572000 




CG 27.630000 




C02 23.326000 


RES_ID 


735 


CA 59.574000 




HGl 2. 028000 




HD2t» 0.786000 


RES~TYPE 


ILE 


KA 3.886000 




CD 50.760000 




ENO_RES_DEF 


SPIN_SYSTEM_ID 


21 


CB 32.380000 




HD2 3.656000 






HETEROGENEITY 


100 


HBl 1.873000 




HOI 3. 800000 




RES_ID . 739 


N 130.700000 




HGl 1.033000 




END_RES_DEP 




RES~TYPE TYR 


HN 7.951000 




HDl 1.530000 








SPIN_SYSTEM_ID IS 


CA 65.080000 




END_RES_OEF 




RES_ID 


723 


HETEROGENEITY 100 


HA 3.786000 








RES~TYPE 


ARG 


N 119.060000 


CB 38.095000 




RES_ID 


741 


SPIN_SYSTEM_ID 


9 


HN 8.031000 


HB 1.879000 




RES~TYPE 


SER 






SPIN_SYSTEM_ID 27
HETEROGENEITY 100
N 110.369000
HN 7.557000
CA 59.034000
HA 4.448000
CB 63.960000
HB1 4.004000
END_RES_DEF

RES_ID 742
RES_TYPE HIS
SPIN_SYSTEM_ID 28
HETEROGENEITY 100
N 125.619000
HN 7.536000
CA 58.473000
HA 3.967000
CB 32.588000
HB1 3.990000
HB2 2.799000
CD2 118.930000
HD2 4.978000
CE1 138.755000
HE1 7.522000
END_RES_DEF

RES_ID 743
RES_TYPE GLN
SPIN_SYSTEM_ID 29
HETEROGENEITY 100
N 138.571000
HN 8.543000
CA 59.125000
HA 4.309000
CB 39.834000
HB1 3.111000
CG 33.690000
HG1 2.390000
NE2 113.173000
HE21 7.581000
HE22 6.870000
END_RES_DEF

RES_ID 744 

res'typb SER 

SPIN_SYSTEM ID 30 
HETEROGENEITY 100 

N 119.060000 

HN 11.668000 

CA 60.125000 

HA 4.838000 

CB €3.980000 

HBI 4 . 334000 

HB2 3.926000 
END_RES_DEF 

RES_ID 745 
RES TYPE ALA 
SPIN_SYSTEM ID 31 
HETEROGENEITY 100 

N 117.584000 

HN 7.868000 

CA 53.510000 

HA 4.396000 

CB 30.470000 

HB» 1.688000 
EKD_RES_DEP 

RES_ID 746 
RES_TYPB TRP 
SPIN_SYSTEM ID 33 
HETEROGENEITY 100 

N 116.600000 

HN 7,135000 

CA 60.691000 

HA 4.368000 

CB 37.630000 

HBI 3.594000 

HB3 3.351000 

CD! 138.843000 

HDl 7.897000 

NEl 110.861000 

HEl 10.474000 

CE3 123.234000 

HE3 7.336000 

C22 116.177000 

HZ2 7.382000 

CZ3 123.336000 

H23 7.197000 
CH2 126.089000 
HH2 7.150000 
END RES DEF 



RES ID 



747 



RES_TYPE PRO 
SPIN_SYSTEM ID 33 
HETEROGENEITY 100 
CA 64.531000 
HA 3.7S6000 
CB 29.835000 
HBI 0.487000 
HB2 -0. 783000 
CG 26.530000 
HGl 0.233000 
HG3 -0.931000 
CD 50.313000 
HD3 1.S67000 
HDl 2.177000 
ENO_RES_OEF 

RES_ID 748 

res'type PHE 

SPIN_SYSTEM 10 34 
HETEROGENEITY 100 

N 113.321000 

HN 7.S85000 

CA 55.719000 

HA 4.930000 

CB 39.303000 

HBI 3.491000 

HB3 3.S32O0O 

CDl 133.248000 

HDl 7.099000 

HEl 7.174000 

HZ 7.296000 
END_RES_DEF 

RES ID 749 
RES~TYPE MET 
SPIN_SYSTEM_ID 35 
HETEROGENEITY 100 

N 117.748000 

HN 7.115000 

CA 56.620000 

HA 4.386000 

CB 32.590000 

HBI 3.333000 

HB3 2.174000 

CG 33 .140000 

HGl 2.851000 

CE 17.168000 

HE# 2 .175000 
END_RES_DEF 

RES_ID 750 
RES_TYPE GLU - 

SPIN_SYSTEM ID 36 
HETEROGENEITY 100 

N 113.813000 

HN 7.709000 

CA 53.516000 

KA 4.849000 

CB 31.487000 

HBI 3.091000 

HB2 1.730000 

CG 35.893000 

HGl 3.164000 
END_RES_DEF 

RES_ID 751 
RES_TYPE PRO 
SPIN_SYSTEM_ID 37 
HETEROGENEITY 100 

CA 63.879000 

HA 4.343000 

CB 33.040000 

HBI 3.328000 

HB2 1.683000 

CG 27.080000 

HGl 3 .136000 

HG2 1.978000 

CD 50.763000 

HOI 3.670000 
END_RES_DEF 

RES_ID 752 
RES_TYPE VAL 
SPIN_SYSTEM_ID 38 
HETEROGENEITY 100 

N 134.450000 

HN 8.134000 

CA 63 .430000 

HA 3.553000 

CB 32.580000 

HB 1.145000 

CGI 21.573000 

HGlft 0.464000 

CG2 21.573000 

m2» 0 .169000 



END_RES_DEP 

RES_ID 753 
RES TYPE LYS 
SPIN_SYSTEM_ID 39 
HETEROGENEITY 100 
N 139.883000 
HN 9.045000 
CA 56.310000 
HA 4.370000 
CB 33.880000 
HSl 1.873000 
HGl 1.435000 
HDl 1.673000 
HBI 3.985000 
END_RES_DEP 

RES ID 754 
RES~TYPE ARC 
SPIN SYSTEM ID 40 
HETEROGENEITY 100 

N 130.308000 

HN 8.054000 
END_RES_DEF 

HES_ID 7SS 
RES_TYPE THR 
SPIN_SYSTEM ID 41 
HETEROGENEITY 100 
CA 63.43dO00 
HA 4.038000 
CB 68.380000 
HB 4.393000 
CG2 22.670000 
HG3tf 1.367000 
BND_RES_DEF 

RES_ID 756 
RES_TYPE GLU 
SPIN_SYSTEM ID 43 
HETEROGENEITY 100 

N 118.733000 

HN 7.309000 

CA 56.370000 

HA 4.448000 

CB 30.930000 

HBI 2.174000 

HB2 2.000000 

CG 36.440000 

HGl 3.292000 
END_RES_DEF 

RES_ID 757 
RES TYPE ALA 
SPIN SYSTEM_ID 43 
HETEROGENEITY 100 

N 123.504000 

HN 7.379000 

CA 50.320000 

HA 4.937000 

CB 19.370000 

HBtf 1.082000 
ENO_RES_DEP 

RES ID 758 
RES~TYPE PRO 
SPIN_SYSTEM_ID 44 
HETEROGENEITY 100 

CA 65.080000 

HA 4.496000 

CB 31.487000 

HBI 3.374000 

HB3 3 .027000 

CG 27.632000 

HGl 3.122000 

HG2 2.038000 

CD 50.313000 

HD2 3.515000 

HDl 3.717000 
END_RES_DEF 

RES_ID 759 

RES_TYPE GLY 

SPIfJ SYSTEM_ID 45 

HETEROGENEITY 100 
END_RES_DEF 

RES ID 760 

res'type tyr 
spin_system_id 46 
heterogeneity 100 

N 122.504000 
HN 7.945000 
CA 63.338000 
KA 3.536000 



CB 39.750000 
HBI 2.689000 
HB2 3.487000 
CDl 133.799000 
HDl S. 130000 
CEl 118.379000 
HEl 6.070000 
END_RES_DEP 

RES^ID 761 
RES_TYPE TYR 
SPIN^SYSTEM ID 47 
HETEROGENEITY 100 
N 113.157000 
HN 8.235000 
CA 60,676000 
HA 4,101000 
CB 37.550000 
HBI 3.189000 
HB2 2. 801000 
CDl 134,901000 
HDl 7.342000 
CEl 118 .930000 
HEl 6.646000 
ENO_RES_DEP 

RES_rD 763 
RES~TYPE GLU 
SPIN SYSTEM ID 48 
HETEROGENEITY. 100 
N 117.913000 
HN 7.703000 
CA 57,933000 
HA 4.309000 
CB 39.480000 
HBI 3.086000 
CG 37.545000 
HGl 3.335000 
HG3 2.265000 
END_RES_DEP 

RES_ID 763 
RES_TYPE VAL 
SPIN_SYSTEM_ID 49 
HETEROGENEITY 100 
N 115.453000 
HN 7.135000 ^ 
CA 63.430000 
HA 4.077000 
CB 33.690000 
HB 3.015000 
CGI 31.030000 
HG1# 1.045000 
CG3 21.574000 
HG2U 0.991000 
END_RES_DEF 

R£S_ID 764 
RSS~TYPE ILE 
SPIN_SYSTEM_ID SO 
HETEROGENEITY 100 
N 122.833000 
HN 7.947000 
CA 57,930000 
HA 3.916000 
CB 34.340000 
HB 1.305000 
CGI 34.878000 
HGll 0,798000 
HG13 0.216000 
CG2 16.617000 
HG2tf 0.380000 
CDl 9,457000 
HD1# 0.537000 
END_RES_DEP 

RES_ID 765 
RES_TYPE ARC 
SPIN SYSTEM ID 51 
HETEROGENEITY 100 

N 125.291000 

HN 7.749000 

CA 57.371000 

HA 3.875000 

CB 30.936000 

HBI 1.388000 

HB3 1.311000 

CG 37.08000f0 

HGl- 1.319000 

HG3 1.173000 

CD 43.053000 

HDl 3,971000 
END RES DEP 



ID 



766 






RES^TYPS SER 


END_RES_DEP 


COl 35.439000 


SPIN_SYSTEM_ID 69 


SPIN_SYSTEM_ID 52 




HDltt 1.067000 


HETEROGENEITY 100 


HBTEROGSNSITY 100 


aES_ID 772 


CD3 27.061000 


N 115.780000 


V 600000 


R£S~TYPE THR 


H02# 0.671000 


HN 7.698000 


HN 8.387000 


SPIH_SYSTEM_ID 58 


END_RES_DEP 


CA 62.330000 


CA 54.616000 


HETEROGENEITY 100 




HA 4.083000 


KA 4.984000 


N 122.176000 


RES_ID 778 


CB 31.500000 


CB 38.640000 


HN 9.445000 


RES'tYPE LYS 


HB 3.331000 


HBl 3.034000 


CA 67.040000 


SPIN_SYSTEM ID 64 


CGI 31.570000 


HB2 3.907000 


HA 3.845000 


HETEROGENEITY 100 


HGin 0.944000 


END_RES_DEP 


CB 67.635000 


N 130.372000 


CG2 16 .820000 




HB 4.090000 


HN 7.958000 


HG3tt 0.833000 


RES_ID 767 


CG2 32.124000 


CA 59. 574000 


BND_RES_OEP 


RES^TYPE PRO 


HG2tt 1.058000 


HA 4.333000 




SPrN_SYSTEM_ID 53 


END_RES_DEP 


CB 32.588000 


RES_ID 784 


HETEROGENEITY 100 




HBl 3.055000 


RES_TYPE SER 


CA 63.439000 


RES_ID 773 


CG 34.878000 


SPIN_SYSTEM_ID 70 


HA 4.083000 


RES~TYPB MET 


HGl 1.596000 


HETEROGENE ITY 100 


CB 33.588000 


SPIN_SYSTEM_ID 59 


CD 39.835000 


N 111.353000 


HBl 3.309000 


HETEROGENEITY 100 


HDl 1.804000 


HN 7.415000 


CG 38.180000 


N 117.912000 


CE 41.951000 


CA 55.719000 


HGl 3 .177000 


HN 7.882000 


HEl 3.990000 


HA 4.741000 


HG2 1.883000 


CA 60.676000 


END_RES_DEF 


CB 66.183000 


•CO SO. 763000 


HA 4.319000 




HBl 4.300000 


HD2 3.390000 


CB 33.342000 


RES_ID 779 


HB3 3.750000 


HDl 3 .623000 


HBl 2.093000 


RES~TYPE ASN 


END_RES_OEP 


END_RES_DEP 


HB2 1.915000 


SPIN_SYSTEM_ID 65 






CG 33.139000 


HETEROGENEITY 100 


RES_ID 785 


RES 10 768 


HGl 2.631000 


N 116.108000 


RES~TYPE LYS 


RES~TYPB MET 


KG3 3.496000 


HN 7.947000 


SPIN^SYSTEM^ID. 71 


SPlii SYSTEM ID 54 


CE 16.630000 


CA 53.510000 


HETEROGENEITY 100 


HETEROGENEITY 100 


HE0 1.241000 


HA 4.771000 


CA 59.030000 


N 119.060000 


END_RES_DEP 


CB 38.095000 


HA 4.031000 


HN 8.430000 




HBl 3.019000 


CS 31.590000 


CA 54.067000 


RES ID 774 


HB3 3.773000 


ENO_RES_DEF 


HA 4.935000 


RES_TYPE SER 


ND3 112.665000 




CB 31.487000 


SPIN SYSTEM ID 60 


HD21 7.598000 


RES_IO 766 


HBl 1.989000 


HSTEROGENEITY 100 


HD22 6.969000 


RES^TYPE LYS 


HB2 1.353000 


N 116.108000 


END_RES_DEF 


SPrN_SYSTEM_ID 73 


CG 30.930000 


HN 7.958000 




HETEROGENEITY 100 


HGl 2.690000 


CA 62.879000 


RES ID 780 


N 120.208000 


CE 14.414000 


HA 4.200000 


RES TYPE ARG 


HN 8.344000 


HEtt 1.929000 


CB 62.879000 


SPIN SYSTEM ID 66 


CA 59.730000 


ENO_RES_DEF 


HBl 4.368000 


HETEROGENEITY 100 


HA 4.062000 




HB2 4.040000 


N 114.141000 


CB 30.385000 


RES ID 769 


END_RES_DEF 


KN 8.158000 


HBl 1.779000 


RES TYPE ASP 




CA 56.831000 


CG 24 .530000 ^ 


SPIN SYSTEM ID 55 


RES ID 775 


HA 4.405000 


CD 28.182000 " 


HETBROGENBITY 100 


RES TYPE GLU 


CB 35.439000 


HDl 1.680000 


N 119.060000 


SPIN SYSTEM ID 61 


HBl 3.097000 


CE 41.670000 


HN 7.365000 


HETEROGENEITY 100 


HB2 2.022000 


HEl- 3.137000 


CA 53.516000 


N 124.471000 


CG 27.633000 


HE2 3.045000 


HA 4.745000 


HN 8.150000 


HGl 1.539000 


END_RES_DEF 


CB 44.154000 


CA 59.570000 


HG3 1.S34000 




HBl 2.371000 


HA 4.045000 


CO 43.050000 


RES ID 787 


END_RES_DEF 


CB 39.280000 


HDl 3.060000 


res'type leu 




HBl 2.246000 


HD3 3.024000 


SPIN system id 73 


RES ID 770 


HB3 2.063000 


END_RES_DEF 


HETEROGENEITY 100 


RES TYPE LEU 


CG 36.443000 




N lia. 732000 


SPIN_SYSTEM ID 56 


HGl 2.345000 


RES ID 781 


HN 7.432000 


HETEROGENEITY 100 


HG2 2.176000 


RES_TlfPE TYR 


CA 57.933000 


N 116.272000 


END_RES_DEP 


SPIN SYSTEM ID 67 


HA 4.213000 


KN 9.055000 




HETEROGENEITY 100 


CB 43.603000 


CA 57.922000 


RES ID 776 


N 116.764000 


HBl 1.996000 


HA 4.036000 


RES TYPE ARC 


HN 8.222000 


HB2 1.691000 


CB 41.400000 


SPIN SYSTEM ID 62 


CA 60.135000 


CG 27.632000 


HBl 2.095000 


HETEROGENEITY 100 


HA 4.064000 


HG 1.794000 


HS2 1.395000 


» 120.372000 


CB 40.850000 


COl 25.979000 


CG 27.080000 


HH 8.391000 


HBl 2.948000 


HDltt 0.924000 


HG 1.713000 


CA 60.676000 


HB2 2.O5S000 


CD2 23.776000 


COl 27.080000 


HA 3.869000 


COl 134.350000 


H02ff 0.695000 


H01# 0.940000 


CB 30.385000 


HDl 6.285000 


END_RES_DEF 


CD2 22.675000 


HBl 3.047000 


CEl 116 .930000 




HD2# 0.628000 


HB3 1.076000 


HEl 6.709000 


RES_ID 768 


END_RES_OEP 


CG 29.284000 


END_RES_DEP 


RES~TYPE PHE 




HGl 1.722000 




SPIN_SYSTEM_ID 74 


RES_ID 771 


HG3 0.877000 


RES_ID 782 


HETEROGENEITY 100 


RES~TYPE LYS 


CD 44.154000 


RES_TYPE TYR 


N 118.732000 


SPIN^SYSTEM ID 57 


HOI 3.578000 


SPIN_SYSTEM ID 68 


HN 6.936000 


HETEROGENEITY 100 


B03 2 .051000 


HETEROGENEITY 100 


CA 60.676000 


N 128.079000 


END_RES_DEF 


N 114.633000 


HA 3.763000 


HN 8.738000 




HN 8.014000 


CB 39.750000 


CA 60.676000 


RES_ID 777 


CA 57.920000 


HBl 3.945000 


HA 4.198000 


RES^TYPE LEU 


HA 4.528000 


HB3 2.381000 


CB 32.037000 


SPIN_SYSTEM_ID 63 


CB 36.443000 


CDl 133.799000 


HBl 2 . 330000 


HETEROGENEITY 100 


HBl 3.062000 


HOI 6.400000 


HB2 2.324000 


N 130.306000 


HB2 2 .907000 


CEl 131.596000 


CG 25.380000 


HN 8.856000 


COl 133.348000 


HEl 6.928000 


HGl 1.463000 


CA 56.470000 


HDl 7.175000 


END_RES_DEP 


HG3 1.403000 


HA 4.691000 


CEl 130.583000 




CO 30.365000 


CB 43.621000 


HEl 7.366000 


RES^IO 789 


HDl 1.793000 


HBl 2.295000 


END_RES_DEP 


RES^TYPE MET 


HD2 1.696000 


HB3 1.935000 




SPIN_SYSTEM_ID 75 


CE 41.950000 


CG 37.060000 


RES_ID 783 


HETEROGENEITY 100 


HEl 2.965000 


HG 1.833000 


res'type val 


N 116.372000
HN S. 489000

CA 59.030000
HA 3.911000
CB 32.590000
HB1 3.318000
HB2 3.308000
CG 33.140000
HG1 3.942000
HG2 2.611000
CE 17.168000
HE# 2.037000
END_RES_DEF

RES_ID 790
RES_TYPE ALA
SPIN_SYSTEM_ID 76
HETEROGENEITY 100
N 119.716000
HN 8.000000
CA 55.170000
HA 4.084000
CB 18.270000
HB# 1.485000
END_RES_DEF

RES_ID 791
RES_TYPE ASP
SPIN_SYSTEM_ID 77
HETEROGENEITY 100
N 119.716000
HN 7.376000
CA 57.371000
HA 4.371000
CB 38.646000
HB1 3.730000
END_RES_DEF

RES_ID 792
RES_TYPE LEU
SPIN_SYSTEM_ID 78
HETEROGENEITY 100
N 119.550000
HN 7.363000
CA 57.933000
HA 3.398000
CB 40.299000
HB1 0.757000
HB2 0.442000
CG 37.633000
HG 0.707000
CD1 24.327000
HD1# 0.184000
CD2 25.979000
HD2# 0.061000
END_RES_DEF

RES_ID 793 
RES_TYPE GLN 
SPIN SYSTEM_1D 79 
HETEROGENEITY 100 

N 114 .141000 

HN 8. 069000 

CA 59.024000 

HA 3.804000 

CB 28.733000 

HBl 2.157000 

HB2 2.097000 

CG 35.343000 

HGl 3.460000 

NE2 111.353000 

HE21 7.319000 

HE33 7.223000 
END RES OEF 



RES_IO 794 

RES^TYPE ARC 

SPlii_SYSTEM_ID 8 0 

HETEROGENEITY 100 



N 118.563000 
HN 7.382000 
CA 58.473000 
HA 4.078000 
CB 29.835000 
HBl 1.973000 
HB2 1.686000 
CG 27.080000 , 
HGl 1.742000 
CD 43.603000 
HDl 3.390000 
HD2 3.325000 
END_RES_DEF 

RES_rD 795 
RES_TYPE VAL 
SPIN_SYSTEM ID 81 



HETEROGENEITY 100 
M 117.912000 
HN 7.013000 
CA 66.730000 
HA 3.039000 
CB 30.930000 
HB 1.435000 
CGI 22.124000 
HG1» 0.479000 
CG3 31.573000 
HG2# 0.142000 

END_RES_DEF 

RES^ID 796 

res'type PHB 
SPIN_SySTEM_IO 83 
HETEROGENEITY 100 



N 


116.938000 


HN 


6.357000 


CA 


58.470000 


HA 


4.161000 


CB 


38.096000 


HBl 


3.090000 


HB3 


2.944000 


COl 


132.147000 


HDl 


6.641000 


CEl 


131.596000 


HEl 


6.456000 


CZ 


129.393000 


HZ 


6.406000 



END_RES_DEF 



RES_ID 797 
RES_TYPE THR 
SPIN_SYSTEM_ID 83 
HETEROGENEITY 100 
N 115.289000 
HN 9.047000 
CA 66.734000 
HA 3.838000 
CB 68.380000 
HB 4.210000 
CG2 22.120000 
HG2# 1.296000
END_RES_DEF

RES_ID 798
RES_TYPE ASN 
SPIN_SYSTEM_ID 84 
HETEROGENEITY 100 
N 120.700000 
HN 8.846000 
CA 55.170000 
HA 4.315000 
CB 38.090000 
HBl 2.985000 
HB2 2.661000 
END_RES_DEF 

R;ES_ID 799 
RES_TYPE CYS 
SPIN_SYSTEM_ID 65 
HETEROGENEITY 100 

N 116.928000 

HN 6.893000 

CA 62.157000 

KA 4.405000 

CB 26.S3O00O 

HBl 3.304000 

HB3 3.032000 
END RES DEP 



RES_ID 8 00 

RES^TYPE LYS 

SPIN_SYSTEM_ID 86 

HETEROGENEITY 100 



N 116.764000 
HN 7.799000 
CA 58.473000 
HA 4.304000 
CB 33.588000 
HBl 1.743000 
CG 35.439000 
HGl 1.313000 
HG3 0.138000 
CD 29,835000 
HDl 1.291000 
CE 41.400000 
HEl 2.486000 
HE2 2.42X000 
END_RES_DBF 

RES_ID 801 
RES~TYPE GLU 
SPIN^SYSTEM ID 87 



HETEROGENEITY 100 
N 117.912000 
HN 7.945000 
CA 57.992000 
HA 4.250000 
CB 30.365000 
HBl 2,172000 
HB2 2.003000 
CG 36.994000 
. HGl 2.407000 
HG2 2.203000 

END_RES_DEF 

RES_ID 802 
RES TYPE TYR 
SPIN_SYSTEM_ID 88 
HETEROGENEITY 100 
N 116.600000 
HN 7.744000 
CA 60.676000 
HA 4.369000 
CB 41.400000 
HBl 3.929000 
CDI 134.901000 
HOI 6.989000 
CEl 119.481000 
HEl 6.823000 
END_RES_DEF 

RES_ID 803 
RES^TYPE ASN 
SPIN^SYSTEM ID 8 9 
HETEROGENE ITY 100 
N 115.944000 
HN 8. 241000 
CA 51.864000 
HA 5.024000 
CB 40.849000 
HBl 3.069000 
HB2 3.907000 
ND2 118.732000 
HD21 8.316000 
HD22 7.809000 
END_RES_DBF 

R£S_IO 804 

RES~TYPE ALA 

SPIN_SYSTEM_ID 90 

HETEROGENEITY 100 
END_RES_DEF 

RES_ID 805 
RES_TYPE PRO 
SPIN_SYSTEM_ID 91
HETEROGENEITY 100
CA 63.980000
HA 2.422000
HB1 1.949000
HG1 1.648000
HG2 1.558000
CD 50.762000
HD2 3.601000
HD1 3.706000
END_RES_DEF

RES_ID 806
RES_TYPE GLU
SPIN_SYSTEM_ID 92
HETEROGENEITY 100
N 113.993000
HN 8.246000
CA 56.820000
HA 4.185000
CB 28.733000
HB1 2.095000
HB2 1.973000
CG 36.270000
HG1 2.200000
END_RES_DEF

RES_ID 807
RES_TYPE SER
SPIN_SYSTEM_ID 93
HETEROGENEITY 100
N 115.780000
HN 8.112000
CA 58.473000
HA 4.406000
CB 66.183000
HB1 4.393000
HB2 4.157000
END_RES_DEF

RES_ID 808
RES_TYPE GLU
SPIN_SYSTEM_ID 94
HETEROGENEITY 100
N 123.488000
HN 9.061000
CA 59.574000
HA 4.233000
CB 29.835000
HB1 2.169000
CG 36.443000
HG1 2.538000
END_RES_DEF

RES_ID 809 
RES_TYPE TYR
SPIN_SYSTEM_ID 95
HETEROGENEITY 100
N 116.436000
HN 8.072000
CA 60.120000
HA 3.834000
CB 37.550000
HB1 3.018000
HB2 2.738000
CD1 132.698000
HD1 6.891000
CE1 120.032000
HE1 7.011000
END_RES_DEF

RES_ID 810 
RES_TYPE TYR
SPIN_SYSTEM_ID 96
HETEROGENEITY 100
N 119.880000
HN 7.356000
CA 61.777000
HA 3.819000
CB 40.300000
HB1 3.390000
HB2 2.500000
CD1 136.553000
HD1 7.094000
CE1 119.481000
HE1 7.000000
END_RES_DEF 

RES_ID 

RES_TYPE LYS
SPIN_SYSTEM_ID 97
HETEROGENEITY 100
N 118.076000
HN 8.072000
CA 60.676000
HA 4.204000
CB 32.588000
HB1 2.091000
CG 25.979000
HG1 1.819000
HG2 1.583000
CD 29.834000
HD1 1.813000
CE 41.963000
HE1 3.963000
END_RES_DEF 

RES_ID 812
RES_TYPE CYS
SPIN_SYSTEM_ID 98
HETEROGENEITY 100
N 116.764000
HN 8.520000
CA 65.087000
HA 4.302000
CB 27.080000
HB1 3.396000
HB2 3.056000
END_RES_DEF 

RES_ID 813 
RES_TYPE ALA 
SPIN_SYSTEM_ID 99 
HETEROGENEITY 100
N 120.700000
HN 8.315000
CA 55.563000
HA 3.834000
CB 18.270000
HB# 1.597000
END_RES_DEF 

RES_ID 814 
RES_TYPE ASN 
SPIN_SYSTEM_ID 100
HETEROGENEITY 100
N 115.453000
HN 8.068000
CA 56.370000
HA 4.339000
CB 35.646000
HB1 3.077000
HB2 3.834000
END_RES_DEF

RES_ID 815
RES_TYPE ILE
SPIN_SYSTEM_ID 101
HETEROGENEITY 100
N 119.880000
HN 7.913000
CA 65.080000
HA 3.646000
CB 39.197000
HB 1.924000
CG1 39.384000
HG11 1.883000
HG12 1.301000
CG2 17.718000
HG2# 1.017000
CD1 13.863000
HD1# 0.940000
END_RES_DEF

RES_ID 816 

RES_TYPE LEU
SPIN_SYSTEM_ID 102
HETEROGENEITY 100
N 133.504000
HN 8.556000
CA 56.820000
HA 3.670000
CB 41.951000
HB1 1.405000
HB2 1.199000
CG 36.530000
HG 1.580000
CD1 24.327000
HD1# 0.701000
CD2 35.429000
HD2# 0.696000
END_RES_DEF

RES_ID 817
RES_TYPE GLU
SPIN_SYSTEM_ID 103
HETEROGENEITY 100
N 130.700000
HN 8.073000
CA 60.135000
HA 3.185000
CB 29.835000
HB1 1.720000
HB2 1.310000
CG 37.545000
HG1 2.001000
HG2 1.932000
END_RES_DEF

RES_ID 818
RES_TYPE LYS
SPIN_SYSTEM_ID 104
HETEROGENEITY 100
N 117.584000
HN 7.145000
CA 59.688000
HA 4.075000
CB 32.588000
HB1 1.929000
CG 25.644000
HG1 1.492000
CD 29.284000
HD1 1.681000
CE 41.963000
HE1 2.964000
END_RES_DEF 

RES_ID 819 
RES_TYPE PHE
SPIN_SYSTEM_ID 105
HETEROGENEITY 100
N 121.028000
HN 7.869000
CA 61.230000
HA 4.328000
CB 39.200000
HB1 3.133000
HB2 3.047000
CD1 133.600000
HD1 7.180000
END_RES_DEF



RES_ID 820
RES_TYPE PHE
SPIN_SYSTEM_ID 106
HETEROGENEITY 100
N 120.700000
HN 9.126000
CA 60.691000
HA 3.961000
CB 38.640000
HB1 3.289000
HB2 3.067000
CD1 133.248000
HD1 6.904000
CE1 133.698000
HE1 7.011000
END_RES_DEF

RES_ID 821
RES_TYPE PHE
SPIN_SYSTEM_ID 107
HETEROGENEITY 100
N 118.076000
HN 8.359000
CA 61.770000
HA 3.840000
CB 38.090000
HB1 3.064000
CD1 133.248000
HD1 7.175000
CE1 133.698000
HE1 7.294000
CZ 131.596000
HZ 7.430000
END_RES_DEF

RES_ID 822 
RES_TYPE SER
SPIN_SYSTEM_ID 108
HETEROGENEITY 100
N 114.961000
HN 7.906000
CA 61.773000
HA 4.300000
CB 63.879000
HB1 4.007000
END_RES_DEF

RES_ID 823
RES_TYPE LYS
SPIN_SYSTEM_ID 109
HETEROGENEITY 100
N 130.864000
HN 7.938000
CA 56.820000
HA 4.008000
CB 31.487000
HB1 1.730000
HB2 1.567000
CG 33.226000
HG1 0.833000
CD 27.080000
HD1 1.403000
CE 43.501000
HE1 3.569000
HE2 2.433000
END_RES_DEF

RES_ID 824 
RES_TYPE ILE
SPIN_SYSTEM_ID 110
HETEROGENEITY 100
N 116.928000
HN 8.101000
CA 64.530000
HA 3.818000
CB 36.990000
HB 1.746000
CG1 26.530000
HG11 1.140000
HG12 1.073000
CG2 18.830000
HG2# 0.654000
CD1 13.312000
HD1# 0.541000
END_RES_DEF

RES_ID 825 
RES_TYPE LYS 
SPIN_SYSTEM_ID 111 
HETEROGENEITY 100
N 133.176000
HN 7.546000
CA 59.024000
HA 4.043000
CB 33.360000
HB1 1.879000
HB2 1.757000
CG 24.878000
HG1 1.390000
HG2 1.303000
CD 29.284000
HD1 1.633000
CE 41.400000
HE1 2.913000
END_RES_DEF

RES_ID 826
RES_TYPE GLU
SPIN_SYSTEM_ID 112
HETEROGENEITY 100
N 121.192000
HN 8.063000
CA 59.024000
HA 3.995000
CB 29.834000
HB1 2.058000
CG 36.050000
HG1 3.342000
HG2 3.205000
END_RES_DEF

RES_ID 827
RES_TYPE ALA
SPIN_SYSTEM_ID 113
HETEROGENEITY 100
N 117.748000
HN 7.620000
CA 52.410000
HA 4.291000
CB 19.930000
HB# 1.358000
END_RES_DEF 

RES_ID 828
RES_TYPE GLY
SPIN_SYSTEM_ID 114
HETEROGENEITY 100
N 126.767000
HN 7.744000
CA 45.903000
HA1 4.019000
HA2 3.935000
END_RES_DEF 

RES_ID 829 
RES_TYPE LEU 
SPIN_SYSTEM_ID 115
HETEROGENEITY 100
N 117.912000
HN 7.742000
CA 55.719000
HA 4.215000
CB 43.052000
HB1 1.563000
CG 37.633000
HG 1.536000
CD1 23.776000
HD1# 0.711000
END_RES_DEF 

RES_ID 830
RES_TYPE ILE
SPIN_SYSTEM_ID 116
HETEROGENEITY 100
N 115.453000
HN 7.458000
CA 60.676000
HA 4.232000
CB 39.748000
HB 1.810000
CG1 27.080000
HG11 1.314000
HG12 0.918000
CG2 17.718000
HG2# 0.815000
CD1 13.312000
HD1# 0.794000
END_RES_DEF 

RES_ID 831
RES_TYPE ASP
SPIN_SYSTEM_ID 117
HETEROGENEITY 100
N 123.488000
HN 8.270000
CA 54.620000
HA 4.571000
CB 41.400000
HB1 2.693000
HB2 3.540000
END_RES_DEF

RES_ID 832
RES_TYPE LYS
SPIN_SYSTEM_ID 118
HETEROGENEITY 100
N 125.450000
HN 7.774000
CA 57.720000
HA 4.082000
CB 33.410000
END_RES_DEF
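
The residue records above follow a regular keyword/value layout, so they can be read programmatically. The following is a minimal sketch, not part of the original disclosure. It assumes cleaned records in which every line is a key followed by a single value, and the function name `parse_shift_blocks` and the dictionary layout are illustrative choices only. A value that is still garbled (non-numeric) would raise a `ValueError`.

```python
def parse_shift_blocks(text):
    """Parse RES_ID ... END_RES_DEF records into a list of dicts.

    Hypothetical helper: header keys are stored lower-cased, and every
    other two-field line is treated as an atom name and its shift in ppm.
    """
    header_keys = {"RES_TYPE": str, "SPIN_SYSTEM_ID": int, "HETEROGENEITY": int}
    residues, current = [], None
    for raw in text.splitlines():
        parts = raw.split()
        if not parts:
            continue  # skip blank lines
        key = parts[0]
        if key == "RES_ID":
            # A record whose residue number is missing keeps res_id = None.
            res_id = int(parts[1]) if len(parts) > 1 else None
            current = {"res_id": res_id, "shifts": {}}
        elif current is None:
            continue  # ignore anything outside a record
        elif key == "END_RES_DEF":
            residues.append(current)
            current = None
        elif key in header_keys and len(parts) == 2:
            current[key.lower()] = header_keys[key](parts[1])
        elif len(parts) == 2:
            current["shifts"][key] = float(parts[1])
    return residues


SAMPLE = """
RES_ID 798
RES_TYPE ASN
SPIN_SYSTEM_ID 84
HETEROGENEITY 100
N 120.700000
HN 8.846000
CA 55.170000
HA 4.315000
CB 38.090000
HB1 2.985000
HB2 2.661000
END_RES_DEF
"""

residues = parse_shift_blocks(SAMPLE)
```

Here `residues[0]["shifts"]["CA"]` holds the CA shift of residue 798 as a float, and `residues[0]["res_type"]` holds the residue type string.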

Table 4



Hydrogen Bonding Restraints 



! Helix Z
assign (residue 19 and name HN ) (residue 15 and name O ) 1.80 0.0 0.40
assign (residue 19 and name N ) (residue 15 and name O ) 2.80 0.30 0.40
assign (residue 22 and name HN ) (residue 18 and name O ) 1.80 0.0 0.40
assign (residue 22 and name N ) (residue 18 and name O ) 2.80 0.30 0.40
assign (residue 23 and name HN ) (residue 19 and name O ) 1.80 0.0 0.40
assign (residue 23 and name N ) (residue 19 and name O ) 2.80 0.30 0.40
assign (residue 24 and name HN ) (residue 20 and name O ) 1.80 0.0 0.40
assign (residue 24 and name N ) (residue 20 and name O ) 2.80 0.30 0.40
assign (residue 25 and name HN ) (residue 21 and name O ) 1.80 0.0 0.40
assign (residue 25 and name N ) (residue 21 and name O ) 2.80 0.30 0.40

! Helix B
assign (residue 75 and name HN ) (residue 71 and name O ) 1.80 0.0 0.40
assign (residue 75 and name N ) (residue 71 and name O ) 2.80 0.30 0.40
! assign (residue 77 and name HN ) (residue 73 and name O ) 1.80 0.0 0.40
! assign (residue 77 and name N ) (residue 73 and name O ) 2.80 0.30 0.40
assign (residue 78 and name HN ) (residue 74 and name O ) 1.80 0.0 0.40
assign (residue 78 and name N ) (residue 74 and name O ) 2.80 0.30 0.40
assign (residue 79 and name HN ) (residue 75 and name O ) 1.80 0.0 0.40
assign (residue 79 and name N ) (residue 75 and name O ) 2.80 0.30 0.40
! assign (residue 80 and name HN ) (residue 76 and name O ) 1.80 0.0 0.40
! assign (residue 80 and name N ) (residue 76 and name O ) 2.80 0.30 0.40
assign (residue 81 and name HN ) (residue 77 and name O ) 1.80 0.0 0.40
assign (residue 81 and name N ) (residue 77 and name O ) 2.80 0.30 0.40
assign (residue 82 and name HN ) (residue 78 and name O ) 1.80 0.0 0.40
assign (residue 82 and name N ) (residue 78 and name O ) 2.80 0.30 0.40

! Helix C
assign (residue 102 and name HN ) (residue 98 and name O ) 1.80 0.0 0.40
assign (residue 102 and name N ) (residue 98 and name O ) 2.80 0.30 0.40
assign (residue 103 and name HN ) (residue 99 and name O ) 1.80 0.0 0.40
assign (residue 103 and name N ) (residue 99 and name O ) 2.80 0.30 0.40
assign (residue 104 and name HN ) (residue 100 and name O ) 1.80 0.0 0.40
assign (residue 104 and name N ) (residue 100 and name O ) 2.80 0.30 0.40
assign (residue 105 and name HN ) (residue 101 and name O ) 1.80 0.0 0.40
assign (residue 105 and name N ) (residue 101 and name O ) 2.80 0.30 0.40
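
Each restraint in Table 4 pairs a backbone amide atom (HN or N) of residue i with the carbonyl oxygen of residue i-4 and gives three numbers. The sketch below is illustrative and not part of the original disclosure. It assumes the X-PLOR/CNS convention, in which the three numbers are a target distance d, a lower tolerance and an upper tolerance, so that the allowed range runs from d minus the lower tolerance to d plus the upper tolerance, in Angstroms. The names `ASSIGN_RE` and `parse_restraints` are hypothetical, and lines beginning with `!` are treated as comments.

```python
import re

# One restraint per line: two atom selections followed by d, d_minus, d_plus.
ASSIGN_RE = re.compile(
    r"assign\s*\(\s*residue\s+(\d+)\s+and\s+name\s+(\w+)\s*\)\s*"
    r"\(\s*residue\s+(\d+)\s+and\s+name\s+(\w+)\s*\)\s*"
    r"([\d.]+)\s+([\d.]+)\s+([\d.]+)"
)


def parse_restraints(text):
    """Return active restraints as dicts with lower and upper distance bounds."""
    restraints = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("!"):
            continue  # blank line, section label, or commented-out restraint
        match = ASSIGN_RE.match(line)
        if match is None:
            continue  # not a complete single-line restraint
        res1, atom1, res2, atom2, d, d_minus, d_plus = match.groups()
        d, d_minus, d_plus = float(d), float(d_minus), float(d_plus)
        restraints.append({
            "donor": (int(res1), atom1),
            "acceptor": (int(res2), atom2),
            "lower": d - d_minus,  # assumed convention: d - d_minus
            "upper": d + d_plus,   # assumed convention: d + d_plus
        })
    return restraints


SAMPLE = """
! Helix C
! assign (residue 80 and name HN ) (residue 76 and name O ) 1.80 0.0 0.40
assign (residue 105 and name HN ) (residue 101 and name O ) 1.80 0.0 0.40
assign (residue 105 and name N ) (residue 101 and name O ) 2.80 0.30 0.40
"""

restraints = parse_restraints(SAMPLE)
```

Under the stated assumption, the HN to O restraint allows 1.80 to 2.20 Angstroms and the N to O restraint allows 2.50 to 3.20 Angstroms.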